Major Department: Anatomy and Cell Biology

We have increased our chances of isolating cDNAs that code for estrogen-induced
proteins by constructing a liver cDNA library from the poly A+ RNA of Fundulus
heteroclitus treated with estradiol-17β. We report cDNAs coding for two vitellogenins
(Vtg I and Vtg II) and three novel proteins that share identity with mammalian ZP
proteins. We have designated the latter proteins as "choriogenins" to highlight their role
as components of the vitelline envelope and chorion, yet to emphasize their site of
synthesis as being extra-ovarian, and thus different from that of the mammalian ZP
proteins.

Conceptual  translations  of  the  F.  heteroclitus  Vtg  I  and  II  cDNAs  share  60% 
sequence  identity  with  each  other  and  30%  identity  with  other  reported  vertebrate  Vtgs. 
The N-terminus of a 69 kDa yolk protein matched the predicted N-terminus of Vtg II
(minus  a  signal  peptide),  verifying  that  Vtg  II  is  expressed  without  being  N-terminally 
blocked.  Six  other  yolk  proteins  were  mapped  to  the  predicted  Vtg  I  sequence, 
confirming  that  Vtg  I  represents  the  major  yolk  protein  precursor.  A  125-kDa  yolk 
protein  that  is  specifically  degraded  during  final  maturation  was  mapped  to  a  region  of 
the  Vtg  I  sequence  that  contained  a  PEST  site,  suggesting  an  explanation  for  its 
preferential  break-down. 

The  three  choriogenins  were  referred  to  as  Chg  500,  Chg  427,  and  Chg  553, 
according  to  the  number  of  amino  acids  predicted  for  each  protein.  Chg  500  and  553 
were  found  to  be  58%  identical  to  a  flounder  "zp  gene  product",  and  30%  identical  with 
the  mouse  ZP1  protein.  Chgs  500  and  553  contain  proline-glutamine-rich  repeating 
regions  that  resemble  a  PXX  motif  reported  in  other  extracellular  matrix  proteins.  Chg 
427 was found to be 67% identical to a medaka "L-SF protein" and 30% identical to
the  mouse  ZP3  protein  that  has  been  implicated  as  the  primary  sperm  receptor.  Besides 
reporting  the  sequences  of  five  hepatically-derived  proteins  that  contribute  to  the 
development  of  the  ovarian  follicle,  we  emphasize  that  the  estrogen-induced  library  is  an 
excellent  strategy  to  screen  for  reproductively  significant  cDNAs. 
CHAPTER 1
GENERAL  INTRODUCTION 

The  Demands  of  the  Germ  Cell 

Reflecting  on  reproductive  strategies  of  vertebrates,  I  am  reminded  of  a  once 
familiar phrase used by an automobile repair shop: "You can pay me now... or you can
pay  me  later."  This  is  a  fitting  slogan,  I  think,  to  describe  two  alternative  relationships 
between  germ  cells  and  somatic  cells,  as  manifested  by  different  vertebrate  groups. 
Although  the  germ  cells  of  all  vertebrates  are  bound  to  receive  an  investment  from  their 
associated  somatic  cells,  this  investment  can  be  delivered  either  sooner  or  later  according 
to  the  specific  developmental  programs.  As  adults,  ourselves,  we  may  consider  the 
investment  made  by  mothers  to  their  young  as  an  opportunity  that  is  chosen  by  the 
mother,  voluntarily.  However,  this  point  of  view  has  been  described  by  some  as  "adult 
chauvinism,"  biased  toward  the  attitudes  and  experiences  of  the  adult  (Wallace,  1983). 
An  alternative  view  would  be  that  the  mother,  or  the  somatic  cells,  are  essentially  held 
captive  by  the  germ  cells,  and  (if  healthy)  have  no  choice  but  to  respond  when  called 
upon  for  support.  As  an  illustration,  consider  the  physiological  state  of  the  mummichog, 
Fundulus  heteroclitus  (Fig  1.1).  When  the  days  of  winter  begin  to  grow  long,  and  the 
water  temperature  rises,  the  female  mummichog  does  not  have  much  say  in  the  matter, 
but  her  ovary  begins  to  grow  dramatically,  mainly  by  the  incorporation  and  storage  of 
Figure 1.1 The mummichog, Fundulus heteroclitus, an estuarine teleost of the
Order Cyprinodontiformes, as drawn by Lynn Milstead of the
Whitney Laboratory. The top fish is the female; the bottom fish,
displaying more pigment, is the male.
yolk by the oocytes (vitellogenesis) (Taylor et al., 1977; Hsiao et al., 1994). The origin
of the yolk proteins can be traced to a cascade of events resulting in the maternal liver
synthesizing a suite of secreted proteins, primarily consisting of the yolk precursor,
vitellogenin (Vtg), but also containing riboflavin- and vitamin-binding proteins (White,
1987; White and Merrill, 1988) and, most recently discovered, precursors of the vitelline
envelope (Hamazaki et al., 1985; Murata et al., 1991; Hyllner et al., 1991). Thus, the
oocytes,  or  germ  cells,  demand  an  investment  by  the  maternal  or  somatic  cells.  They 
are  saying,  "Pay  me  now."  This  extensive  investment  begins  long  before  fertilization, 
without  the  adult  knowing  whether  the  eggs  will  actually  ever  be  spawned  or  fertilized. 
Once  the  oocytes  are  expelled,  the  female,  having  already  surrendered  a  substantial 
amount  of  energy  and  material,  is  relieved  of  any  further  investment  (until  the  next  clutch 
of  oocytes  begins  its  demands). 

On  the  other  hand,  in  mammals  the  germ  cells  present  more  of  a  "Pay  me  later" 
scenario.  Mammalian  oocytes  appear  to  not  receive  any  yolk  at  all,  with  synthesis  of 
vitellogenin  presumed  (but  not  proven)  to  be  totally  nonexistent  in  mammals  (except  in 
the  egg  laying  monotremes)  (Eckelbarger,  1994).  The  investment,  then,  comes  mainly 
after  fertilization,  with  support  and  nourishment  provided  first  by  a  modification  of  the 
uterus  into  the  chorionic  villi,  and  secondly  through  lactation,  where  protein  nourishment 
continues  to  be  demanded  by  the  progeny,  and  thus  supplied  by  the  adult. 

The  work  contained  in  this  dissertation  provides  an  example  of  the  "Pay  me  now" 
demands  of  the  oocyte  on  its  somatic  surroundings.  We  provide  evidence  of  at  least  five 
distinct  proteins  that  are  made  by  the  maternal  liver,  in  response  to  estradiol,  and 


Figure 1.2 A transmission electron micrograph providing an ultrastructural view of
the environment surrounding the oocyte membrane. To the bottom left is
the cytoplasm of the oocyte including a yolk sphere (arrow) containing
processed yolk proteins, derived from Vtg. Distal to the oocyte
membrane is the stratified appearance of the vitelline envelope (bracket),
containing components derived from the choriogenins. This micrograph
was kindly provided by Kelly Selman (X 12,200).
transported to the ovary, to be used by the germ cells and their descendants. Two of
these  proteins,  Vtg  I  and  Vtg  II,  are  endocytosed  by  the  oocyte,  processed,  and  stored 
as  yolk  (Fig.  1.2),  mainly  to  be  used  as  a  nutrient  source  by  the  developing  embryo. 
The  three  remaining  proteins,  designated  the  choriogenins  (Chgs),  are  also  synthesized 
by  the  estrogen-induced  liver,  and  transported  to  the  ovary.  However,  rather  than  being 
endocytosed,  the  Chgs  are  laid  down  as  an  extracellular  matrix  between  the  oocyte  and 
follicle  cells  forming  the  vitelline  envelope  (Fig.  1.2,  in  brackets). 

The  Original  Emphasis:  Vitellogenins 

One  of  the  initial  goals  of  this  project  was  to  establish  a  definitive  precursor- 
product  relationship  between  vitellogenin  and  the  processed  yolk  proteins.  It  was  decided 
that  primary  sequence  information  would  be  needed  for  this  goal  and  that  the  best  method 
to  gain  the  amino  acid  sequence  of  vitellogenin  was  to  use  a  molecular  approach,  produce 
a  cDNA  library,  screen  for  Vtg  with  degenerate  primers  designed  from  yolk  proteins, 
and  sequence  the  cDNA  clone.  Before  the  lengthy  Vtg  sequence  was  completed,  the 
original  research  team  disbanded.  I  subsequently  joined  the  Wallace  lab  and  thereby 
"inherited"  the  Vtg  sequencing  project.  Influenced  by  the  dissertation  of  Byrne  (1989) 
describing  the  evolution  of  yolk  proteins,  I  became  interested  in  the  evolutionary  aspects 
of  Vtg,  particularly  in  the  independently  evolving  phosvitin  domain.  The  lack  of  a 
phosvitin  domain  in  the  Caenorhabditis  elegans  Vtgs  (Speith  et  al.,  1991)  prompted  the 
idea  that  phosvitin  may  be  an  exclusively  vertebrate  inclusion  within  the  Vtg  gene  (Byrne 
et al., 1989). We theorized that an interesting oviparous model may be provided by the
protochordate Branchiostoma floridae, the Florida lancelet.

The  original  aims  of  my  project  were,  thus,  to  complete  the  F.  heteroclitus  Vtg 
cDNA  sequence,  and  thereafter  use  the  piscine  cDNA  as  a  heterologous  probe  to  isolate 
phylogenetically  primitive  Vtgs.  I  succeeded  in  the  former  goal,  and  the  results  of  that 
work  are  provided  in  Chapter  2.  I  began  screening  a  cDNA  library  synthesized  from  the 
mRNA of spawning female amphioxus, B. floridae, by a PCR-based method that utilized
degenerate primers designed by aligning the currently known Vtg protein sequences.
Before  long,  a  new  Vtg  cDNA  was  successfully  isolated.  However,  the  new  Vtg  was 
isolated  from  the  "control"  F.  heteroclitus  library  template,  rather  than  the  targeted 
amphioxus  library  template  (Fig.  1.3).  At  that  time,  no  two  Vtgs  from  one  vertebrate 
species  had  yet  been  completely  sequenced,  and  so  this  appeared  to  be  a  worthwhile 
challenge.  Additionally,  the  sequence  of  two  F.  heteroclitus  Vtgs  would  provide 
information  presumably  necessary  to  continue  mapping  out  the  precursor-product 
relationships  of  the  Vtgs  and  the  yolk  proteins.  As  a  result,  phylogenetic  aspects  of  Vtg 
evolution  were  shelved  in  order  to  consider  the  variations  of  Vtg  that  might  be 
encountered  from  within  one  species,  F.  heteroclitus.  The  second  primary  aim  of  my 
project was thus to complete the Vtg II cDNA; these data are provided in Chapter 3.

While completing the two Vtg cDNA sequences, N-terminal sequences of the yolk
proteins  were  also  being  obtained.  This  work  represented  a  collaborative  effort  that 
included  data  collected  from  three  students  at  the  Whitney  lab  (including  myself)  plus  a 
considerable  effort  by  the  ICBR  Protein  Core  facility.  Eventually  we  established  a 
scheme mapping out the specific processing of two Vtgs into several separate yolk
protein  products.  We  have  further  submitted  a  hypothesis  implicating  a  PEST  site  found 
within  the  predicted  YP  125  sequence  as  a  possible  factor  influencing  its  extensive 
degradation.  This  study  is  presented  in  Chapter  4. 

A  New  Emphasis:  Estrogen-Induced  Reproductive  Proteins 

While  completing  the  Vtg  II  cDNA  sequence,  using  a  PCR-based  screening 
method,  other  non-target  cDNAs  were  often  isolated.  This  is  a  common  phenomenon 
in  cloning  that  is  usually  dismissed  as  misfortune.  However,  because  our  template  was 
an  estrogen-induced  cDNA  library,  the  non-target  cDNAs  stood  a  likely  chance  of 
representing  reproductively  significant  molecules.  This  was  exactly  the  case  concerning 
the Chgs. All three of the Chg cDNAs were isolated by a fortuitous mis-priming event
that  occurred  while  screening  for  Vtg  II  cDNAs  (Fig.  1.3).  Only  recently  had  a 
hypothesis  been  submitted  that  ascribed  the  origin  of  the  major  proteins  of  the  teleost 
vitelline  envelope  to  the  estrogen-induced  liver  (Hamazaki  et  al.,  1987b).  This  ran 
counter  to  the  mammalian  literature  that  had  established  the  oocyte  as  the  primary  site 
of  synthesis  for  the  proteins  of  the  mammalian  zona  pellucida  (Wassarman,  1988a). 
Nevertheless, our data verify that several Chgs are in fact synthesized by the liver,
transported  to  the  ovary,  and  laid  down  as  the  vitelline  envelope  between  the  oocyte  and 
the  follicle  cells.  These  proteins  have  been  referred  to  by  several  names.  When  isolated 
from  the  ovarian  follicle,  they  are  usually  called  vitelline  envelope  proteins 
(VEPs) (Hyllner et al., 1991). Another nomenclature based on isolating the proteins from
Vtg  II  4650     CTGGATGAGAGGCCAGACGTGTGGGC7CTGCGGAAAGGCCGACGGGGAAGTCAGACAGG  4693

*****   ********   **  *** 

TGTGG I CTCTGCGG I AAIAACGA 
ROW  19  (degenerate)  C  G    T      CG  T 


Chg  500  913       TCCTGGACCTC7GCGTGTGGAGCTCAGGCTTGGGAATGGAGAGTGTTCTGTCAAGGGTT  975 

********   **         *  ** 

ROW  45  GAGCTCAGTCTGTACACTGCT 


Chg  427  669       CAGCCTTCCTCTGGATCCCCTTTGGGTCCCATTCTCTGCAG7TAAGATGGCTGAGGAGT  698 

******  *  **** 

ROW  55  CATT  CTG AAACTTG AAG AC  C  C 


Chg  553  320       CTCATTGTTGGGAGGAGGTCAAGGCTGTACACATGTTGACCCCAATTCACTTTTTGCCA  373 

***   ***   *   ********  * 

ROW  45  GAGCTCA-GTCTGTACACTGCT 


Figure  1.3  Four  accounts  of  fortuitous  annealing  that  resulted  in  the  eventual 
isolation  of  cDNAs  coding  for  Vtg  II,  Chg  500,  Chg  427,  and  Chg 
553.  Vtg  II  was  discovered  using  the  degenerate  primer  ROW  19. 
Chg  500  and  Chg  553  were  discovered  using  ROW  45,  originally 
designed  for  annealing  to  Vtg  II.  Chg  427  was  isolated  using 
ROW  55,  also  designed  to  anneal  to  Vtg  II. 
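The partial matches marked by asterisks in Figure 1.3 can be checked with a short script. The following is only an illustrative sketch, not the procedure used in this work: it compares the ROW 45 primer with part of the Chg 500 sequence exactly as the two strands are printed in the figure, and the function names are hypothetical. The template fragment below begins after the first 12 characters of the printed Chg 500 line, which include one unreadable character. Under these assumptions, the window beginning at offset 7 of the fragment gives 13 identities out of 21 positions, the same number as the asterisks shown for that panel.

```python
# Illustrative sketch: count identities between a primer and a template window.
# Sequences are copied from the Chg 500 / ROW 45 panel of Figure 1.3.
# Function names are hypothetical and chosen only for this example.


def count_matches(primer: str, window: str) -> int:
    """Return the number of positions at which primer and window agree."""
    if len(primer) != len(window):
        raise ValueError("primer and window must have the same length")
    return sum(p == w for p, w in zip(primer, window))


def best_ungapped_site(primer: str, template: str) -> tuple[int, int]:
    """Slide the primer along the template without gaps.

    Returns (offset, matches) for the first offset with the highest score.
    """
    best_offset, best_score = 0, -1
    for offset in range(len(template) - len(primer) + 1):
        window = template[offset:offset + len(primer)]
        score = count_matches(primer, window)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


ROW45 = "GAGCTCAGTCTGTACACTGCT"
# Chg 500 line of Figure 1.3, starting after the unreadable character.
CHG500_FRAGMENT = "GCGTGTGGAGCTCAGGCTTGGGAATGGAGAGTGTTCTGTCAAGGGTT"

# Identities in the window that starts with GAGCTCAG (offset 7).
figure_window = CHG500_FRAGMENT[7:7 + len(ROW45)]
print("identities at offset 7:", count_matches(ROW45, figure_window))

# Best ungapped placement anywhere in the fragment.
offset, matches = best_ungapped_site(ROW45, CHG500_FRAGMENT)
print("best offset:", offset, "identities:", matches, "of", len(ROW45))
```

A simple identity count ignores the position of the mismatches; in practice a run of matches at the 3' end of a primer matters more for extension than the total, so the count is best read as a rough indicator only.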


the blood of spawning females used the terms "low molecular weight spawning
female-specific substance" (L-SF) and "high molecular weight spawning female-specific
substance" (H-SF) (Hamazaki et al., 1987a). Still other groups that concentrated on sequence
identity  between  their  teleost  proteins  and  the  published  mammalian  ZP  proteins,  referred 
to  their  sequences  as  teleost  ZPs  (Lyons  et  al.,  1993).  We  designated  the  cDNAs  and 
coded  proteins  described  here  as  choriogenins  (Chgs),  precursor  proteins  of  the  vitelline 
envelope  and  chorion.  We  feel  that  this  name  accentuates  the  role  of  these  molecules  as 
structural components of the vitelline envelope and chorion, yet emphasizes their
origin as being extra-ovarian and thus
different  from  the  homologous  ZP  proteins  of  mammals.  In  Chapter  5,  we  present  the 
cDNA  and  protein  sequences  of  three  Chgs,  as  well  as  a  partial  characterization  of  F. 
heteroclitus  VEPs.  The  Chg  data  represent  the  most  novel  aspect  of  the  dissertation, 
with  the  hypothesis  of  liver-derived  vitelline  envelope  components  still  fairly  recent.  One 
of  the  remaining  paradoxes  presented  by  the  Chg  sequences  is  the  comparative  disparity 
between  the  mammalian  and  teleostean  systems  for  producing  the  extracellular  matrix  that 
surrounds  the  oocyte.  Because  the  sequence  identity  between  the  Chgs  and  mammalian 
ZP  proteins  suggests  an  ancestral  relationship,  the  differences  in  gene  regulation,  site  of 
synthesis,  and  functional  roles  offer  a  wealth  of  interesting  questions  for  future 
investigations. 

By  providing  the  structure  of  five  previously  unsequenced  molecules  that 
contribute  to  the  architecture  of  the  ovarian  follicle,  we  have  contributed  to  our 
understanding  of  reproductive  processes  in  F.  heteroclitus.  However,  we  are  even  more 
impressed by the remarkable resource proven to lie within the estrogen-induced liver
library that was used to isolate these cDNAs. Rather than being merely a means to an end,
the library has come to be regarded as possibly the most important attribute of the
project. We expect that other estrogen-induced liver products can be easily isolated from
it,  and  modifications  of  this  strategy  can  be  used  in  the  future  to  investigate  other 
inductive  hormone  effects  on  other  tissues. 


CHAPTER  2 
FUNDULUS  HETEROCLITUS  VITELLOGENIN: 
THE  DEDUCED  PRIMARY  STRUCTURE  OF  A  PISCINE  PRECURSOR  TO 
NON-CRYSTALLINE,  LIQUID-PHASE  YOLK  PROTEINS 


Introduction 

Vitellogenin (Vtg) is a large phosphoglycoprotein (~200 kDa) used by most
oviparous  animals  as  a  maternally  derived  yolk  precursor  (Pan  et  al.,  1969;  Kunkel  and 
Nordin,  1985;  Wallace,  1985;  Selman  and  Wallace  1989).  It  is  synthesized  by  either  the 
liver  (vertebrates),  fat  body  (insects),  or  intestine  (nematodes)  under  hormonal  induction 
and  transported  to  growing  oocytes  via  the  blood  (Flickinger  and  Rounds,  1956;  Wallace 
and Jared, 1969). Vtg is incorporated into oocytes by receptor-mediated endocytosis
(Opresko  et  al.,  1980;  Opresko  and  Wiley,  1987;  Shen  et  al.,  1993)  and  is  stored  for 
later  use  by  the  developing  embryo  (Flickinger,  1960;  Yamagami,  1960;  Karasaki, 
1963b;  Selman  and  Pawsey,  1965;  Murakami  et  al.,  1990).  Once  inside  the  oocyte,  Vtg 
is processed into smaller yolk proteins consisting of lipovitellins (Lv1 and Lv2),
phosvitins  (Pv),  and  phosvettes,  that  may  in  turn  be  degraded  into  even  smaller  cleavage 
products  (Flickinger  and  Rounds,  1956;  Taborsky,  1967;  Wallace  and  Selman,  1985; 
Gerber-Huber  et  al.,  1987;  Greeley  et  al.,  1986). 
The now familiar Vtg gene family (Wahli et al., 1979, 1991; Tata et al., 1980;
Blumenthal  et  al.,  1984;  Nardelli  et  al.,  1987;  Byrne  et  al.,  1989;  Speith  et  al.,  1991) 
encompasses  Vtgs  synthesized  by  a  wide  range  of  metazoans  including  the  nematode 
Caenorhabditis  elegans  (Speith  et  al.,  1985),  the  boll  weevil  Anthonomus  grandis 
(Trewitt et al., 1992), the silkworm Bombyx mori (Yano et al., 1994), the mosquito
Aedes aegypti (Chen et al., 1994), the cyclostome Ichthyomyzon unicuspis (Sharrock et
al., 1992), the anuran Xenopus laevis (Germond et al., 1984; Gerber-Huber et al., 1987),
and the chicken Gallus domesticus (van het Schip et al., 1987). Additionally, two human
cDNAs, those encoding von Willebrand factor (~250 kDa) (Baker, 1988a) and
apolipoprotein B-100 (~510 kDa) (Baker, 1988b), have also been reported as distantly
related  members  of  the  Vtg  gene  family.  Exceptions  to  a  Vtg-derived  yolk  precursor 
system  have  been  reported  in  at  least  two  dipteran  species:  Drosophila  melanogaster 
(Hovemann  et  al.,  1981)  and  Ceratitis  capitata  (Rina  and  Savakis,  1991)  where  yolk 
precursors,  often  called  Vtgs,  do  not,  in  fact,  share  significant  sequence  identity  with  the 
"Vtg gene family," setting a precedent for the use of alternative molecules in the
production  of  yolk  (Terpestra  and  AB,  1988;  Bownes,  1992). 

A  large  component  of  vertebrate  Vtgs,  the  Pv  region,  was  found  to  be  nonexistent 
in  both  C.  elegans  and  the  boll  weevil  Vtg  (Nardelli  et  al.,  1987;  Trewitt  et  al.,  1992), 
inspiring  the  notion  that  Pv  was  an  element  unique  to  vertebrate  Vtgs.  The  apparent 
absence  of  the  Pv  region  from  invertebrate  Vtgs  (see  Discussion),  along  with  studies 
documenting  the  ability  of  the  phosphate  groups  of  Pv  to  bind  and  transport  large 
amounts  of  divalent  cations,  especially  Ca++  (Urist  et  al.,  1958;  Urist  and  Schjeide, 
1961; Taborsky, 1980), have led to the speculation that Pv may be important in
embryonic  bone  formation  (Mecham  and  Olcott,  1949;  Rabinowitz,  1962;  Taborsky, 
1974;  Lange,  1981;  Wallace  and  Begovac,  1985;  Nardelli  et  al.,  1987;  Byrne  et  al., 
1989).  Of  additional  interest  is  the  hypothesis  that  evolutionary  changes  in  the  Pv  region 
have occurred at a faster rate than in the two flanking regions, Lv1 and Lv2 (Byrne et
al.,  1989).  To  address  comparative  and  evolutionary  questions  about  Vtg,  we  sought  to 
characterize  a  Vtg  cDNA  that  was  phylogenetically  intermediate  to  the  meager  collection 
of  currently  reported  sequences.  Complete  Vtg  sequences  from  the  superclass 
Gnathostomata  have  been  reported  from  only  two  tetrapods  (Xenopus  and  chicken) 
leaving  several  entire  lower  vertebrate  classes  unrepresented.  Since  at  least  half  of  all 
vertebrates  are  contained  within  the  subclass  Teleostei  (Nelson,  1984),  the  absence  of  a 
teleostean  Vtg  sequence  leaves  a  substantial  gap  in  our  understanding  of  Vtg  evolution, 
diversity,  and  function. 

For  the  present  study,  we  chose  as  a  model  the  estuarine  teleost,  Fundulus 
heteroclitus,  which  possesses  a  non-specialized  body  plan  with  a  fairly  typical 
reproductive  system,  in  the  hopes  of  obtaining  a  piscine  Vtg  that  could  be  considered  as 
representative  of  most  teleosts.  Much  work  has  already  been  reported  on  F.  heteroclitus 
describing  vitellogenesis  (Wallace  and  Selman,  1978,  1981;  Selman  and  Wallace,  1983; 
Kanungo  et  al.,  1990),  the  resulting  yolk  proteins  (Wallace  and  Begovac,  1985;  Wallace 
and  Selman,  1985;  Greeley  et  al.,  1986),  and  oocyte  maturation  (Wallace  and  Selman, 
1978,  1980).  Besides  the  advantages  of  F.  heteroclitus  possessing  many  typical 
teleostean  traits,  there  are  at  least  two  characteristics  of  its  yolk  that  presented  additional 
motivation for our comparative analyses. First, the yolk proteins of F. heteroclitus
oocytes  and  eggs  remain  in  a  liquid  form  throughout  oocyte  growth  and  maturation 
(Wallace  et  al.,  1966;  Wallace  and  Begovac,  1985);  this  is  in  marked  contrast  to  the 
more  typical  observation  that  vertebrate  yolk  proteins  are  organized  into  a  specific 
crystalline  lattice  as  was  reported  in  lamprey  (Karasaki,  1967;  Raag  et  al.,  1988), 
sturgeon  (Lange  and  Kilarski,  1986),  several  amphibians  (Karasaki,  1963a),  and  the 
reptile, tuatara (Lange and Kilarski, 1986; reviews by Lange, 1985, Banaszak et al.,
1991). Second, whereas Xenopus and chicken yolk remains in the form of three primary
Vtg cleavage products, Lv1, Pv, and Lv2, plus a few minor peptides or phosvettes (Wiley
and  Wallace,  1981;  Wallace  and  Morgan,  1986a,  1986b;  Wallace  et  al.  1990),  F. 
heteroclitus  yolk  proteins  undergo  substantially  more  processing,  resulting  in  a  complex 
suite  of  smaller  Vtg-derived  cleavage  products  (Wallace  and  Begovac,  1985;  Wallace  and 
Selman,  1985;  Greeley  et  al.,  1986).  We  hoped  that  by  obtaining  the  primary  structure 
of  a  teleostean  Vtg  we  would  not  only  confirm  regions  that  are  ubiquitously  conserved 
among  oviparous  organisms,  but  would  also  reveal  novel  sequence  differences  that  play 
a  role  in  the  yolk  processing  events  unique  to  F.  heteroclitus. 

In  this  paper  we  present  the  predicted  primary  structure  of  F.  heteroclitus  Vtg. 
By  aligning  the  F.  heteroclitus  Vtg  sequence  to  other  vertebrate  Vtgs,  we  found  that  the 
most  significant  differences  occurred  within  the  polyserine  domain.  These  differences 
may  account  for  some  of  the  molecular  phenomena  specifically  associated  with  F. 
heteroclitus  yolk,  such  as  the  perpetuation  of  a  liquid  phase  yolk  in  both  oocytes  and 
eggs,  or  the  substantial  amount  of  proteolytic  processing  which  occurs  in  the  growing 
oocytes. Although the polyserine domain is indeed a polymorphic region, a conserved
genetic  pattern  (Byrne  et  al.,  1984,  1989)  persists  in  all  of  the  vertebrates  thus  far 
examined:  TCX  repeats  at  the  5'  end  and  a  larger  group  of  AGY  repeats  towards  the  3' 
end,  suggesting  an  ancient  origin  of  the  linkage  between  these  two  clusters  of 
trinucleotide  repeats. 

Materials  and  Methods 

Chemicals 

Estradiol-17β was obtained from Sigma Chemical Co. (St. Louis, MO).
Radioisotopes, [α-32P]dCTP and [α-35S]dATP, were purchased from New England
Nuclear (Boston, MA). Lambda gt10 vector and cDNA synthesis reagents were obtained
from Promega (Madison, WI). The subcloning plasmid pGEM-3Z was purchased from
Promega, pT7BLUE from Novagen (Madison, WI), and pCR1000 from Invitrogen (San
Diego, CA). All sequencing gels were cast using Sequagel-8 (National Diagnostics,
Atlanta) polyacrylamide reagents. Amplification reactions were performed using
Thermus aquaticus DNA polymerase (Promega). Sequenase version 2.0 DNA
polymerase and dideoxy sequencing reagents were obtained from US Biochemicals
(Cleveland, OH). Reagents for random-primed labeling of probes were purchased from
Pharmacia (Piscataway, NJ). Both Nytran nylon and S&S NC nitrocellulose transfer
membranes were purchased from Schleicher and Schuell (Keene, NH). Amino acid
N-terminal sequencing and synthesis of oligonucleotide primers were performed by the
University of Florida Interdisciplinary Center for Biotechnology Research core facility.
Induction of vitellogenin synthesis

Male  Fundulus  heteroclitus  were  collected  from  the  estuarine  creeks  adjacent  to 
the Whitney Laboratory, and were maintained in running seawater tanks under 14L:10D
photoperiod  conditions  at  25  ±  2°C.  Fish  were  maintained  for  at  least  one  month  before 
being  used  for  RNA  collections. 

In  order  to  increase  the  proportion  of  Vtg  RNA  within  the  total  RNA  pool, 
vitellogenin  synthesis  was  artificially  induced  in  six  males  (8-10  g  body  weight)  by  two 
intraperitoneal injections of estradiol-17β (0.01 mg/g body weight) dissolved in 50 µl
peanut  oil  (Kanungo  et  al.,  1990).  Five  control  males  were  sham-injected  with  peanut 
oil  alone.  The  first  injection  was  performed  on  day  1,  the  second  injection  on  day  4, 
followed  by  sacrifice  and  liver  dissection  on  day  8. 
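As a rough check on the quantities involved, the per-injection dose implied by 0.01 mg/g can be worked out for the stated body weight range. The sketch below is illustrative only; it assumes that the 50 µl of peanut oil is the volume of a single injection, which the text does not state explicitly, and the function names are hypothetical.

```python
# Illustrative arithmetic for the estradiol injections described above.
# Assumption: 50 ul of peanut oil is the vehicle volume for ONE injection.

DOSE_MG_PER_G = 0.01   # estradiol dose per gram body weight (from the text)
VEHICLE_UL = 50.0      # peanut oil volume, assumed to be per injection


def dose_mg(body_weight_g: float) -> float:
    """Estradiol mass (mg) delivered in one injection."""
    return DOSE_MG_PER_G * body_weight_g


def concentration_mg_per_ml(body_weight_g: float) -> float:
    """Estradiol concentration in the oil needed to deliver that dose."""
    return dose_mg(body_weight_g) / (VEHICLE_UL / 1000.0)


for weight_g in (8.0, 10.0):
    print(
        f"{weight_g:.0f} g fish: {dose_mg(weight_g):.2f} mg per injection, "
        f"{concentration_mg_per_ml(weight_g):.1f} mg/ml in oil"
    )
```

Under this assumption, an 8 to 10 g male receives roughly 0.08 to 0.10 mg of estradiol per injection.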

Isolation of liver poly A+ RNA

Livers  from  both  groups  of  fish  were  collected  and  immediately  placed  in  0°C 
guanidinium thiocyanate solution (5 M guanidinium thiocyanate, 50 mM Tris-HCl, 25 mM
EDTA, 8% v/v mercaptoethanol, pH 7.4) and homogenized by one thirty-second
polytron (Brinkman) blast. RNA was isolated by the guanidinium thiocyanate method
according  to  MacDonald  et  al.  (1987).  One  gram  of  liver  from  estrogen-treated  fish 
yielded  an  average  of  0.536  mg  RNA,  with  an  average  O.D.  260/280  ratio  of  2.03, 
while  a  gram  of  liver  from  sham-treated  fish  yielded  an  average  of  0.318  mg  RNA  with 
an  O.D.  260/280  of  1.87.  Total  RNA  samples  were  combined  into  two  pools:  one 
from six estrogen-treated males and the other from five sham-treated males.

Oligo-dT  cellulose  chromatography  was  used  to  isolate  poly  A+  RNA  from  the 
two initial pools of total RNA (Aviv and Leder, 1972). Of the 2.1 mg total RNA from
estrogen-treated fish, 46.3 µg poly A+ RNA was recovered (2.2% recovery). Poly A+
RNA  from  both  experimental  and  control  fish  was  analyzed  by  northern  blot  analysis  to 
verify  that  Vtg  transcripts  were  included  in  the  poly  A+  RNA  fraction.  The  poly  A+ 
RNA  was  dissolved  in  deionized  glyoxal/DMSO  (1:1)  and  electrophoresed  through  an 
agarose  gel  (McMaster  and  Carmichael,  1977)  and  transferred  by  capillary  action  onto 
a nylon membrane. The membrane was probed with a 32P end-labeled 17-mer
oligonucleotide, MB6 (degeneracy = 32), which was designed from the N-terminal amino
acid sequence of a small yolk peptide isolated from F. heteroclitus oocytes:
His-Lys-Lys-Met-Val-Ala. Autoradiography of northern blots revealed an MB6-positive, ~6 kb
transcript in the estrogen-treated fish that was absent in sham-treated male fish.
This transcript size was consistent with Vtg cDNA previously reported from chicken
(Cozens et al., 1980; Arnberg et al., 1981; van het Schip et al., 1987), frog (Wahli et
al., 1979), and rainbow trout (Le Guellec et al., 1988).
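The stated degeneracy of 32 for the MB6 17-mer can be reproduced from the genetic code. The sketch below is an illustration under stated assumptions, not the documented probe design: the MB6 sequence itself is not given here, so the code assumes the 17-mer covers the six codons of His-Lys-Lys-Met-Val-Ala minus the third (fully degenerate) base of the Ala codon. The names are hypothetical. Whether the probe was synthesized as the sense or antisense strand does not change the count.

```python
# Illustrative sketch: degeneracy of an oligonucleotide back-translated
# from the peptide His-Lys-Lys-Met-Val-Ala.
# Assumption: the 17-mer omits the third base of the final (Ala) codon.
from math import prod

# IUPAC degenerate codons for the residues in the peptide (standard code).
DEGENERATE_CODONS = {
    "H": "CAY",  # His: CAT, CAC
    "K": "AAR",  # Lys: AAA, AAG
    "M": "ATG",  # Met: ATG only
    "V": "GTN",  # Val: GTA, GTC, GTG, GTT
    "A": "GCN",  # Ala: GCA, GCC, GCG, GCT
}

# Number of bases represented by each IUPAC symbol used above.
IUPAC_OPTIONS = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "N": 4}


def back_translate(peptide: str) -> str:
    """Return the degenerate coding sequence for a one-letter peptide."""
    return "".join(DEGENERATE_CODONS[residue] for residue in peptide)


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(IUPAC_OPTIONS[base] for base in oligo)


PEPTIDE = "HKKMVA"
full_18mer = back_translate(PEPTIDE)   # all six codons
probe_17mer = full_18mer[:17]          # drop the last wobble position

print(full_18mer, degeneracy(full_18mer))
print(probe_17mer, degeneracy(probe_17mer))
```

Dropping the final wobble position reduces the pool from 128 to 32 sequences, which is consistent with the degeneracy quoted for MB6.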

cDNA  library  construction,  screening,  and  sequencing 

Synthesis of cDNA was performed by annealing 2 µg poly A+ RNA with oligo
dT primers, and using AMV reverse transcriptase and T4 DNA pol I for first and second
strand synthesis, respectively. EcoRI adapters were ligated to the two ends of the cDNA
transcripts using T4 DNA ligase (these EcoRI adapters were later found to have become
compromised). After phosphorylation of adapter ends, the cDNA transcripts were ligated
into the bacteriophage vector λgt10 (Promega). Once the λ particles were packaged,
the primary library was plated using host E. coli strain C600HFL, resulting in an initial
library titer of 6 x 10^4 total plaque forming units.

Two  24.5  cm2  petri  dishes  were  used  for  plating  phage-transfected  cells 
(~400,000 total plaques). Plaques were lifted onto nylon membranes. Hybridization
was performed at 39°C using 1X Denhardt's solution (Denhardt, 1966) and 6X SSC (1X SSC =
150 mM NaCl and 15 mM sodium citrate, pH 7) with the same end-labelled
oligonucleotide probe described earlier, MB6. The primary screening resulted in 30%
of the plaques testing positive for the degenerate MB6 probe. By following four plaque
clones (λ5, λ20, λ21, λ16) through two more rounds of positive screening, four final λ
clones were set aside for subcloning. The clone (λ21) containing the largest insert
(~5000 bp) was subjected to endonuclease digestion with EcoRI, which was expected
to free the entire cDNA insert. Unfortunately, the EcoRI sites had inadvertently been
modified so that, when digested with EcoRI, one end of the insert remained attached to
the vector. As an alternative, the enzymes HindIII and BglII, which like EcoRI cleave at
rare sites, were used to digest λ21. Two large fragments were released (a 1900 bp fragment
from EcoRI/BglII and a 2060 bp fragment from EcoRI/HindIII) and these were subcloned into
the sequencing plasmid pGEM 3Z, resulting in subclones pMMB6 and pMMB1,
respectively. Because the sizes of these two fragments did not add up to the total putative
insert size (6000 bp), digestion of another clone, λ5, was performed in order to provide
an overlapping sequence. Digestion with HindIII and EcoRI yielded a third fragment,
1610 bp, which was subcloned as pMMB9.
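The restriction mapping described above can be illustrated with a minimal Python sketch. It assumes only the standard recognition sequences of the three enzymes named in the text; the 29-nt `toy` sequence and the helper names `find_sites` and `fragment_sizes` are hypothetical, and fragment sizes are approximated from site start positions rather than from the exact cut position within each site.

```python
import re

# Recognition sequences (5'->3') of the three enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BglII": "AGATCT"}

def find_sites(seq, enzymes=SITES):
    """Return {enzyme: [0-based start positions]} of each recognition site."""
    seq = seq.upper()
    # A lookahead is used so that overlapping occurrences are also reported.
    return {name: [m.start() for m in re.finditer(f"(?={site})", seq)]
            for name, site in enzymes.items()}

def fragment_sizes(seq_len, positions):
    """Approximate sizes of linear fragments when cutting at the given positions."""
    cuts = [0] + sorted(set(positions)) + [seq_len]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]

toy = "CCGAATTCAAAAAGCTTGGGGAGATCTTT"  # hypothetical sequence, not from the thesis
hits = find_sites(toy)
sizes = fragment_sizes(len(toy), [p for ps in hits.values() for p in ps])
```

The same bookkeeping shows why a third subclone was needed: the two fragments from the first clone sum to 1900 + 2060 = 3960 bp, which is less than the estimated insert size.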

Dideoxynucleotide  chain  termination  sequencing  of  these  three  clones  revealed 
that  there  were  two  remaining  nucleotide  stretches  that  were  needed  to  complete  the 
entire cDNA: a small 5' portion which included the initial methionine codon and a ~300
bp overlap between pMMB9 and pMMB1. Both of these additional portions were
retrieved from the cDNA library by PCR techniques. First, the initiating methionine was
retrieved by using an exact forward primer (NEB #1231) complementary to the λgt10
primer adapter sequence and an exact reverse primer, ROW 1, 195 base pairs internal
to  the  existing  5'  end.  The  resulting  product  was  gel-purified  and  ligated  into  the 
sequencing  plasmid  pT7BLUE  by  the  T/A  cloning  method. 

The overlap between pMMB9 and pMMB1 was retrieved in a similar fashion by
using two exact internal primers, ROW 12 and ROW 13, made according to the existing
ends of pMMB9 and pMMB1. The resulting product was gel-purified and ligated into
a similar T/A plasmid, pCR1000. These two PCR inserts were sequenced and found to
overlap with the already existing sequence, resulting in a 5112 bp open reading frame
from  which  we  have  deduced  the  complete  primary  structure  of  the  putative  Fundulus 
heteroclitus  Vtg  polypeptide. 
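As an illustration of how a primary structure is deduced from an open reading frame, the following Python sketch performs a conceptual translation with the standard genetic code. It is not the PC/GENE routine used for the analysis; the function name `translate` is an assumption, and the test codons are the first five of the cDNA as listed in Figure 2.2.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cdna):
    """Translate a cDNA string from its first base up to the first stop codon.

    Whitespace is ignored; codons containing unexpected characters become 'X'.
    """
    seq = "".join(cdna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":          # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)

n_terminus = translate("ATG AAA GCG GTT GTG")
orf_residues = 5112 // 3   # a 5112 bp reading frame encodes 1704 residues
```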

Figure 2.1  Cloning strategy used in isolating Fundulus heteroclitus Vtg
cDNA. Lambda gt10 bacteriophage clones #5 and #21 were
isolated by tertiary screening with the degenerate 17mer, MB6.
pGEM 3Z subclones (pMMB6 and pMMB1) were constructed
from digestion products of λ21, and pMMB9 originated from λ5.
Two remaining sections, pGL3 and pGL5, were isolated by
anchored PCR, using a 5-µl aliquot of the cDNA library as
template and exact primers, and then inserted into pT7Blue and
pCR1000 vectors, respectively.

Sequence analysis

Sequencing data were organized and examined using PC/GENE software
(Intelligenetics, Mountain View, CA), including the following analyses: prediction
of post-translational modification sites by PROSITE, signal peptide prediction by
PSIGNAL, antigenic determinant analysis using ANTIGEN, and codon usage statistics by
CDUSAGE.
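A PROSITE-style site scan of the kind listed above can be sketched in Python as follows. The three patterns are the published PROSITE consensus patterns for N-glycosylation, protein kinase C, and casein kinase II sites, rewritten as regular expressions; the dictionary name `MOTIFS`, the function `scan`, and the choice of the test peptide (the residues translated at nucleotide 4105 in Figure 2.2) are illustrative assumptions, not a reproduction of the PC/GENE implementation.

```python
import re

# PROSITE consensus patterns rewritten as regular expressions:
#   N-{P}-[ST]-{P}   N-glycosylation
#   [ST]-x-[RK]      protein kinase C phosphorylation
#   [ST]-x(2)-[DE]   casein kinase II phosphorylation
MOTIFS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "PKC": r"[ST].[RK]",
    "CK2": r"[ST]..[DE]",
}

def scan(protein, motifs=MOTIFS):
    """Return {motif name: [1-based start positions]}, allowing overlapping hits."""
    return {name: [m.start() + 1 for m in re.finditer(f"(?=({pat}))", protein)]
            for name, pat in motifs.items()}

sites = scan("IHTKRENSTRNISVIAVA")
```

Such short patterns match frequently by chance, so the counts they produce are best read as potential sites only.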

Protein sequence alignments were performed using two programs: ClustalV
(Higgins et al., 1992; PAM 250 matrix, gap penalty = 3, K-tuple = 1, number
of top diagonals = 5, window size = 5) for the multiple alignment, and ALIGN Plus (S&E
Software, State Line, PA) for pairwise alignments. To normalize domain comparisons,
we  defined  a  "polyserine  domain"  within  the  Vtg  sequence  by  choosing  two  well-aligned 
termini  as  the  exterior  boundaries,  thereby  including  all  of  the  poorly-aligned  polyserine 
tracts  on  the  interior.  Because  we  do  not  have  yolk  protein  data  to  map  the  exact  region 
which  is  processed  into  Pv,  we  have  chosen  this  "polyserine  domain"  to  represent  a 
hypothetical  Pv  domain.  The  chicken  and  Xenopus  Pv  termini,  which  have  been 
documented  (Clark,  1973;  Gerber-Huber  et  al.,  1987),  lie  to  the  inside  of  our 
boundaries,  verifying  our  convention. 

A  phylogram  was  drawn  to  compare  Vtg  sequences  from  eight  species.  Although 
multiple  isoforms  of  Vtg  have  been  identified  from  several  organisms,  nomenclature 
formally  separating  these  isoforms  into  subfamilies  has  not  yet  been  proposed.  For  our 
tree  analysis  we  selected  only  one  Vtg  sequence  from  each  species.  In  species  which 
contain  multiple  Vtgs,  we  chose  either  the  only  complete  Vtg  available  from  Genbank 
databases,  as  in  chicken  and  Xenopus,  or  the  Vtg  which  is  considered  the  "major"  yolk 
protein precursor, as in C. elegans (Spieth et al., 1985). An optimal tree was chosen by
importing a ClustalV alignment into the program PAUP (Swofford, 1993) and performing
bootstrap analysis of 100 replicates in a branch-and-bound search. C. elegans Vtg 5 was
defined  as  the  outgroup  and  chicken  Vtg  was  designated  as  the  reference  sequence. 
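The resampling step behind a bootstrap analysis can be sketched in Python as below. This is only the column-resampling part, under the assumption that each replicate is built by drawing alignment columns with replacement; the tree search performed on each replicate by PAUP is not reproduced, and the function names and the seed parameter are hypothetical.

```python
import random

def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement; return a new alignment dict."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have equal length")
    # The same column indices are used for every taxon, so columns stay intact.
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}

def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return a list of pseudo-replicate alignments (100 by default)."""
    rng = random.Random(seed)
    return [bootstrap_columns(alignment, rng) for _ in range(n_replicates)]
```

Each pseudo-replicate would then be passed to the tree-building method, and the fraction of replicates supporting a branch reported as its bootstrap value.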

Accession  codes  of  sequences  used  for  alignments  are  as  follows:  Chicken, 
Gallus  domesticus  Vtg  II,  EMBL:X13607;  Xenopus  laevis  Vtg  A2,  GB:M18061;  silver 
lamprey,  Ichthyomyzon  unicuspis  Vtg,  GB:M88749;  white  sturgeon,  Acipenser 
transmontanus,  partial  Vtg,  GB:U00455;  rainbow  trout,  Oncorhynchus  mykiss  partial 
Vtg  GB:M27651;  tilapia,  Oreochromis  aureus  partial  Vtg,  number  not  available  (Ding 
et  al.,  1990);  boll  weevil,  Anthonomus  grandis  Vtg,  GB:M72980;  nematode, 
Caenorhabditis  elegans  Vtg  5,  EMBL:X56213;  mosquito,  Aedes  aegypti  Vtg, 
GB:U02548;  and  finally,  our  own  mummichog,  Fundulus  heteroclitus  Vtg,  GB:U07055. 

Results 

Cloning 

A summary of our cloning strategy is presented in Figure 2.1. Three restriction
products of two MB6-positive lambda clones (#21 and #5) were subcloned into pGEM
3Z (pMMB1, pMMB6, and pMMB9). Two smaller clones, pGL8 and pGL5, were
amplified by PCR directly from the cDNA library. The five subclones were sequenced
in both directions for a final overlapping cDNA sequence of 5198 bp. The overlapping
cDNA sequence contained an open reading frame of 5112 bp and a poly-A tail of
undetermined length beginning 11 nucleotides after a polyadenylation site (AATAAA)
denoted by underlining in Figure 2.2.
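Locating the polyadenylation signal is a simple substring search, sketched here in Python. The 3' fragment is transcribed from the last three rows of the sequence listing in Figure 2.2 (nucleotides 5077 onward) and may carry transcription errors; the names `THREE_PRIME_START` and `find_signal` are assumptions made for the example.

```python
# 3' end of the cDNA as listed in Figure 2.2, starting at nucleotide 5077.
THREE_PRIME_START = 5077
three_prime = (
    "GCTCACCTGGCTTGCAGCTGCAACACCAAGTGCTCTTAAACATAAGATTTCCTT"
    "GAAGTCACTACTATGTGTAAGTTTTATCTGTAACAATAAATAAACTGCATCTGA"
    "AAATAAAAAAAAAA"
)

def find_signal(seq, start_coord, signal="AATAAA"):
    """Return the 1-based cDNA coordinate of the first occurrence of signal, or None."""
    idx = seq.find(signal)
    return None if idx < 0 else start_coord + idx

signal_pos = find_signal(three_prime, THREE_PRIME_START)
last_base = THREE_PRIME_START + len(three_prime) - 1
```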


Figure 2.2     Translated amino acid sequence (1,704 residues) of the putative F.
heteroclitus Vtg polypeptide. Two separate signal peptide
predictions  are  presented.  The  first  was  obtained  by  an  alignment 
with  other  fish  Vtg  signal  peptides  (Folmar  et  al.,  1995)  and  is 
denoted  by  shaded  lettering.  The  second  prediction  was  obtained 
by  the  computer  analysis  method  of  von  Heijne  (1986)  and  is 
denoted  by  asterisks.  The  nucleotide  stretch  corresponding  to  the 
degenerate  oligonucleotide  MB6,  used  to  screen  the  library,  is 
shown  by  double  underlining  and  bold  letters.  Five  predicted 
antigenic  determinants  are  depicted  by  shaded  lettering  with 
average  hydrophilicity  values  (Ah)  indicated  underneath.  A 
polyadenylation  site  (AATAAA)  is  located  53  nucleotides  past  the 
stop  codon  and  denoted  by  underlining. 
1        ATG AAA GCG GTT GTG CTT GCC CTG ACT CTG GCC TTC GTG GCT GGA CAA AAT TTT
MKAVVLALTLAFVAGQNF

55       GCC CCT GAA [...] GCT GCT GGT AAG ACC TAC GTA TAT AAG TAT GAA GCG CTC ATC
APEFAAGKTYVYKYEALI

109      CTG GCC GGT CTT CCT GAG GAA GGT TTG GCA AGA GCT GGA TTG AAA ATC AGC ACC
LGGLPEEGLARAGLKIST

163      AAA CTT CTA CTC AGT GCA GCT GAC CAA AAT ACT TAT ATG CTG AAG CTT GTG GAA
KLLLSAADQNTYMLKLVE

217      CCT GAG CTC TCT GAG TAC AGC GGC ATT TGG CCA AAG GAC CCA GCA GTG CCA GCA
PELSEYSGIWPKDPAVPA

271      ACC AAG TTG ACA GCA GCC CTT CAC CTC AGC TCG CAA TTC CCA TCA AGT TTG AAT
TKLTAALHLSSQFPSSLN

325      ACA CCA ATG GTG TTT GTT GGT AAA GTC TTT GCT CCT GAG GAA GTC TCG ACT TTG
TPMVFVGKVFAPEEVSTL

379      GTG CTG AAC ATC TAC AGA GGC ATC CTG AAT ATT CTC CAG CTG AAC ATC AAG AAG
VLNIYRGILNILQLNIKK

433      ACC CAC AAA GTC TAT GAC TTG CAG GAG GTT GGA ACT CAG GGT GTG TGC AAG ACC
THKVYDLQEVGTQGVCKT

487      CTC TAT TCC ATC AGT GAA GAT GCA CGA ATT GAG AAC ATC CTT CTG ACC AAG ACC
LYSISEDARIENILLTKT

541      AGG GAC CTG AGC AAC TGC CAG GAA AGA CTC AAT AAG GAC ATC GGG TTG GCA TAC
RDLSNCQERLNKDIGLAY

595      ACT GAG AAA TGC GAC AAG TGC CAG GAG GAA ACT AAA AAC TTG AGA GGT ACC ACA
TEKCDKCQEETKNLRGTT

649      ACA TTA AGT TAC GTC TTG AAA CCA GTC GCC GAT GCC GTC ATG ATC CTG AAG GCG
TLSYVLKPVADAVMILKA

703      TAC GTT AAT GAG CTG ATC CAG TTT TCA CCT TTC TCT GAG GCT AAC GGA GCT GCC
YVNELIQFSPFSEANGAA

757      CAG ATG AGG ACC AAG CAG TCT TTG GAG TTC CTT GAA ATT GAG AAA GAA CCC ATT
QMRTKQSLEFLEIEKEPI

811      CCA TCT GTC AAG GCT GAA TAT CGT CAC CGT GGA TCT CTC AAA TAC GAG TTC TCC
PSVKAEYRHRGSLKYEFS

865      GAT GAA CTT CTT CAG ACA CCC CTT CAG CTG ATC AAG ATC AGT GAT GCA CCA GCC
DELLQTPLQLIKISDAPA

919      CAG GTT GCA GAG GTC CTG AAG CAC CTG GCT ACC TAC AAC ATT GAG GAT GTT CAT
QVAEVLKHLATYNIEDVH

973      GAA AAT GCA CCT TTG AAG TTT TTG GAA CTG GTA CAA CTC CTC CGT ATT GCC CGC
ENAPLKFLELVQLLRIAR

1027     TAT GAA GAT TTG GAA ATG TAC TGG AAC CAG TAC AAA AAG ATG TCT CCC CAC AGA
YEDLEMYWNQYKKMSPHR



1081     CAC TGG TTC TTG GAC ACT ATT CCT GCC ACT GGT ACC TTC GCT GGT CTC AGA TTC
HWFLDTIPATGTFAGLRF

1135     ATC AAA GAG AAG TTC ATG GCT GAG GAA ATA ACC ATC GCT GAG GCA GCT CAG GCT
IKEKFMAEEITIAEAAQA

1189     TTC ATT ACA GCT GTG CAC ATG GTG ACT GCT GAC CCT GAG GTT ATC AAG CTG TTT
FITAVHMVTADPEVIKLF

1243     GAG AGC CTG GTA GAC AGC GAC AAA GTA GTG GAA AAC CCA CTT CTG CGT GAG GTT
ESLVDSDKVVENPLLREV

1297     GTC TTC CTT GGA TAT GGA ACA ATG GTT AAC AAA TAC TGC AAT AAG ACA GTT GAT
VFLGYGTMVNKYCNKTVD

1351     TGT CCT GTT GAA CTC ATA AAG CCT ATT CAA CAA CGA CTG TCA GAC GCC ATT GCA
CPVELIKPIQQRLSDAIA

1405     AAG AAC GAG GAA GAG AAC ATC ATC CTG TAC ATA AAG GTT TTG GGA AAT GCC GGC
KNEEENIILYIKVLGNAG
(Ah = 2.07)

1459     CAT CCA TCT AGC TTC AAG TCA CTC ACT AAG ATC ATG CCC ATC CAT GGC ACT GCT
HPSSFKSLTKIMPIHGTA

1513     GCT GTA TCT CTG CCA ATG ACA ATC CAT GTT GAA GCC ATC ATG GCT CTG AGG AAC
AVSLPMTIHVEAIMALRN

1567     ATT GCA AAG AAG GAG TCC AGA ATG GTC CAG GAA CTG GCT CTC CAG CTC TAC ATG
IAKKESRMVQELALQLYM

1621     GAC AAG GCT CTC CAC CCA GAG CTC CGT ATG CTG TCC TGC ATT GTT CTC TTC GAG
DKALHPELRMLSCIVLFE

1675     ACA AGT CCT TCT ATG GGT TTG GTG ACA ACT GTT GCC AAC TCT GTG AAA ACC GAG
TSPSMGLVTTVANSVKTE

1729     GAG AAT TTG CAG GTG GCC AGC TTC ACT TAC TCT CAC ATG AAG TCC CTA AGC AGG
ENLQVASFTYSHMKSLSR

1783     AGC CCC GCA ACC ATC CAT CCC GAT GTT GCT GCC GCA TGC AGC GCC GCC ATG AAG
SPATIHPDVAAACSAAMK

1837     ATC TTG GGT ACA AAG CTG GAC AGA CTG AGC CTG CGT TAT AGC AAA GCT GTA CAT
ILGTKLDRLSLRYSKAVH

1891     GTG GAC CTC TAC AAC AGT TCC TTG GCG GTC GGT GCT GCT GCA ACT GCT TTT TAC
VDLYNSSLAVGAAATAFY

1945     ATC AAC GAT GCT GCC ACC TTT ATG CCA AAA TCC TTT GTT GCA AAG ACC AAA GGC
INDAATFMPKSFVAKTKG

1999     TTC ATC GCT GGA AGT ACT GCT GAA GTC CTG GAG ATT GGA GCG AAT ATT GAA GGA
FIAGSTAEVLEIGANIEG

2053     CTG CAG GAG CTG ATT CTG AAA AAC CCT GCT CTC TCT GAA AGT ACT GAC AGG ATC
LQELILKNPALSESTDRI

2107     ACC AAA ATG AAG CGA GTC ATT AAG GCT CTG TCA GAA TGG AGA TCC TTG CCC ACC
TKMKRVIKALSEWRSLPT

2161     AGC AAA CCC CTA GCC TCT GTC TAT GTT AAG TTC TTT GGA CAA GAG ATT GGC TTT
SKPLASVYVKFFGQEIGF


2215     GCT AAC ATT GAC AAA CCC ATG ATC GAT AAG GCT GTC AAG TTT GGC AAG GAA TTA
ANIDKPMIDKAVKFGKEL 

2269     CCC  ATT  CAG  GAA  TAT  GGA  AGA  GAG  GCT  CTC  AAG  GCT  CTG  CTC  CTG  TCT  GGC  ATC 
PIQEYGREALKALLLSGI 

2323     AAC  TTC  CAC  TAC  GCT  AAG  CCA  GTG  CTG  GCT  GCT  GAG  ATG  CGA  CGC  ATT  CTT  CCT 
NFHYAKPVLAAEMRRILP 

2377     ACC  GTC  GCT  GGT  ATT  CCA  ATG  GAA  CTC  AGT  CTG  TAC  AGT  GCT  GCT  GTG  GCT  GCA 
TVAGIPMELSLYSAAVAA 

2431     GCC  TCT  GTT  GAA  ATC  AAG  CCC  AAC  ACG  TCA  CCA  CGT  CTG  TCA  GCG  GAC  TTC  GAC 
ASVEIKPNTSPRLSADFD 

2485     GTA  AAG  ACT  CTG  CTG  GAG  ACA  GAC  GTT  GAG  CTC  AAG  GCT  GAG  ATC  AGA  CCA  ATG 
VKTLLETDVELKAEIRPM

2539     GTT  GCC  ATG  GAC  ACA  TAT  GCC  GTT  ATG  GGA  CTT  AAC  ACC  GAC  ATC  TTC  CAG  GCT 
VAMDTYAVMGLNTDIFQA

2593     GCT  TTG  GTA  GCT  CGC  GCT  AAA  CTG  CAC  TCT  GTT  GTG  CCA  GCC  AAA  ATA  GCT  GCA 
ALVARAKLHSVVPAKIAA 

2647     AGA  CTT  AAT  ATC  AAA  GAG  GGT  GAC  TTT  AAG  CTT  GAA  GCT  CTT  CCT  GTT  GAT  GTG 
RLNIKEGDFKLEALPVDV 

2701     CCT  GAA  AAC  ATC  ACA  TCC  ATG  AAT  GTT  ACA  ACC  TTT  GCT  GTA  GCA  AGA  AAC  ATC 
PENITSMNVTTFAVARNI 

2755     GAG  GAA  CCT  TTG  GTT  GAG  AGA  ATC  ACT  CCT  CTT  CTC  CCC  ACC  AAA  GTT  TTG  GTA 
EEPLVERITPLLPTKVLV 

2809     CCC  ATC  CCA  ATC  AGG  AGA  CAC  ACA  TCC  AAG  CTT  GAT  CCC  ACT  CGC  AAT  AGC  ATG 
PIPIRRHTSKLDPTRNSM 

2863     TTA  GAC  TCC  TCA  GAA  CTC  CTT  CCC  ATG  GAA  GAA  GAA  GAT  GTA  GAG  CCC  ATT  CCT 
LDSSELLPMEEEDVEPIP

(Ah  =  2.25) 

2917     GAA  TAC  AAG  TTC  CGT  CGA  TTT  GCC  AAA  AAG  TAC  TGC  GCT  AAG  CAC  ATT  GGT  GTT 
EYKFRRFAKKYCAKHIGV 

2971     GGA  CTG  AAG  GCC  TGT  TTC  AAG  TTT  GCC  AGT  CAA  AAT  GGA  GCC  TCC  ATC  CAA  GAC 
GLKACFKFASQNGASIQD 

3025     ATT  GTC  CTG  TAC  AAA  CTG  GCT  GGT  AGC  CAC  AAC  TTC  TCT  TTC  TCT  GTG  ACA  CCA 
IVLYKLAGSHNFSFSVTP 

3079     ATT  GAA  GGA  GAA  GTT  GTT  GAG  AGA  TTG  GAG  ATG  GAG  GTT  AAA  GTC  GGA  GCA  AAG 
IEGEVVERLEMEVKVGAK 

3133     GCT  GCA  GAG  AAG  CTT  GTT  AAA  CGC  ATC  AAC  CTG  AGT  GAG  GAC  GAA  GAA  ACT  GAA 
AAEKLVKRINLSEDEETE

3187     GAA  GGA  GGT  CCA  GTC  CTG  GTG  AAG  CTC  AAC  AAA  ATC  CTG  TCT  TCA  AGA  CGG  AAC 
EGGPVLVKLNKILSSRRN

3241     AGC  TCC  TCA  TCT  TCC  TCC  TCC  AGC  TCC  AGC  AGC  TCT  TCT  GAG  AGC  CGT  TCT  TCA 
SSSSSSSSSSSSSESRSS 

3295     AGG  TCC  TCC  TCT  TCC  TCC  TCC  TCT  TCA  TCT  CGC  TCC  AGC  CGT  AAG  ATT  GAC  CTT 
RSSSSSSSSSRSSRKIDL 

3349     GCA GCC AGG ACC AAT AGC AGC AGC AGC AGC AGT AGC CGT CGC AGC AGA AGC AGC
AARTNSSSSSSSRRSRSS 

3403     AGC  AGC  AGC  AGC  AGC  AGC  AGT  AGC  AGT  AGC  AGC  AGC  AGC  AGC  AGC  AGC  AGC  AGC 
SSSSSSSSSSSSSSSSSS 

3457     AGG AGA AGC AGC AGC AGC AGC AGT AGT AGC AGC AGC AGC AGC AGT AGG AGC AGC
RRSSSSSSSSSSSSSRSS 

3511     AGG  AGA  GTC  AAC  TCA  ACA  AGA  TCC  AGC  AGC  AGT  TCA  AGT  AGG  ACC  AGC  TCT  GCA 
RRVNSTRSSSSSSRTSSA 

3565     TCA  AGC  CTT  GCA  TCT  TTC  TTC  AGT  GAC  AGC  TCA  AGC  TCT  TCT  AGC  TCC  AGT  GAT 
SSLASFFSDSSSSSSSSD

3619     CGT  CGC  TCA  AAG  GAA  GTG  ATG  GAG  AAG  TTC  CAG  AGG  TTA  CAC  AAG  AAA  ATG  GTC 
RRSKEVMEKFQRLHKKMV

(Ah  =  2.55) 

3673     GCC  TCC  GGT  AGC  AGT  GCC  TCA  AGC  GTT  GAA  GCC  ATC  TAC  AAA  GAG  AAA  AAA  TAT 
ASGSSASSVEAIYKEKKY

3727     CTT  GGC  GAG  GAA  GAA  GCC  GTT  GTG  GCA  GTG  ATT  CTC  CGT  GCT  GTC  AAA  GCT  GAC 
LGEEEAVVAVILRAVKAD 

3781     AAG  AGG  ATG  GTG  GGA  TAC  CAG  CTT  GGT  TTC  TAC  CTT  GAC  AAA  CCA  AAT  GCC  AGA 
KRMVGYQLGFYLDKPNAR 

3835     GTT  CAG  ATC  ATT  GTC  GCC  AAC  ATT  TCT  TCT  GAT  AGC  AAC  TGG  AGG  ATC  TGT  GCT 
VQIIVANISSDSNWRICA

3889     GAT  GCA  GTT  GTG  TTG  AGC  AAG  CAC  AAA  GTT  ACA  ACC  AAG  ATT  TCC  TGG  GGA  GAA 
DAVVLSKHKVTTKISWGE 

3943     CAG  TGC  AGG  AAA  TAC  AGC  ACC  AAT  GTT  ACA  GGA  GAG  ACT  GGT  ATT  GTT  TCT  TCA 
QCRKYSTNVTGETGIVSS 

3997     AGC  CCT  GCC  GCT  CGC  CTC  AGA  GTG  TCC  TGG  GAA  AGA  CTG  CCT  TCT  ACC  CTG  AAA 
SPAARLRVSWERLPSTLK 

4051     CGC  TAT  GGA  AAG  ATG  GTT  AAC  AAG  TAC  GTT  CCT  GTT  AAA  ATA  TTG  TCT  GAC  TTG 
RYGKMVNKYVPVKILSDL 

4105     ATC  CAC  ACA  AAG  AGA  GAA  AAC  AGC  ACC  AGG  AAT  ATC  TCA  GTC  ATT  GCA  GTT  GCC 
IHTKRENSTRNISVIAVA 

4159     ACA  TCT  GAA  AAG  ACA  ATT  GAC  ATC  ATA  ACC  AAA  ACT  CCA  ATG  AGC  TCT  GTC  TAC 
TSEKTIDIITKTPMSSVY 

4213     AAT  GTC  ACT  ATG  CAT  CTT  CCC  ATG  TGT  ATT  CCC  ATT  GAT  GAG  ATC  AAA  GGT  CTC 
NVTMHLPMCIPIDEIKGL 

4267     AGC  CCC  TTT  GAT  GAA  GTC  ATT  GAC  AAG  ATC  CAC  TTC  ATG  GTT  TCT  AAG  GCT  GCT 
SPFDEVIDKIHFMVSKAA 

4321     GCA  GCT  GAA  TGC  AGC  TTC  GTC  GAA  GAC  ACA  CTC  TAC  ACA  TTC  AAC  AAC  AGG  AGC 
AAECSFVEDTLYTFNNRS 

4375     TAC  AAG  AAC  AAG  ATG  CCT  TCC  TCT  TGC  TAC  CAG  GTT  GCA  GCA  CAG  GAC  TGC  ACA 
YKNKMPSSCYQVAAQDCT 

4429     GAT  GAG  CTG  AAA  TTC  ATG  GTT  CTC  CTG  AGG  AAG  GAT  TCG  TCC  GAA  CAA  CAC  CAC 
DELKFMVLLRKDSSEQHH

(Ah  =  2.1) 


4483     ATC AAT GTC AAG ATT TCT GAG ATC GAT ATT GAC ATG TTT CCA AAG GAC GAC AAC
INVKISEIDIDMFPKDDN

4537     GTC ACT GTG AAG GTC AAC GAA ATG GAA ATA CCC CCA CCA GCC TGC CTT ACC GCC
VTVKVNEMEIPPPACLTA

4591     ACC CAA CAG CTT CCA TTG AAG ATC AAG ACA AAG CGG AGA GGA CTT GCT GTC TAT
TQQLPLKIKTKRRGLAVY

4645     GCA CCC AGC CAC GGT CTC CAA GAA GTC TAC TTT GAC AGG AAG ACA TGG AGG ATC
APSHGLQEVYFDRKTWRI

4699     AAA GTT GCT GAC TGG ATG AAA GGA AAG ACC TGT GGA CTC TGT GGA AAG GCT GAT
KVADWMKGKTCGLCGKAD

4753     GGA GAG ATC AGA CAG GAG TAC CAC ACT CCC AAC GGA CGC GTG GCC AAG AAC TCG
GEIRQEYHTPNGRVAKNS

4807     ATC AGC TTT GCT CAC TCC TGG ATT CTT CCT GCT GAA AGC TGC AGG GAT GCA TCT
ISFAHSWILPAESCRDAS

4861     GAG TGC CGT CTG AAA CTT GAA TCT GTG CAG CTG GAG AAA CAG TTG ACC ATC CAC
ECRLKLESVQLEKQLTIH

4915     GGT GAG GAC TCC ACA TGC TTC TCA GTT GAG CCT GTA CCT CGT TGT CTG CCC GGT
GEDSTCFSVEPVPRCLPG

4969     TGC TTG CCT GTC AAG ACC ACA CCT GTC ACT GTT GGT TTC AGC TGC CTG GCA TCT
CLPVKTTPVTVGFSCLAS

5023     GAT CCT CAG ACC AGT GTC TAT GAC AGA AGT GTG GAT CTA AGA CAA ACT ACC CAG
DPQTSVYDRSVDLRQTTQ

5077     GCT CAC CTG GCT TGC AGC TGC AAC ACC AAG TGC TCT TAA ACA TAA GAT TTC CTT
AHLACSCNTKCS

5131     GAA GTC ACT ACT ATG TGT AAG TTT TAT CTG TAA CAA TAA ATA AAC TGC ATC TGA

5185     AAA TAA AAA AAA AA

F. heteroclitus Vtg Sequence

A  conceptual  translation  of  the  5112  bp  open  reading  frame  resulted  in  a  1704- 
amino  acid  protein  sequence  (Fig.  2.2).  A  signal  peptide  was  predicted  (underlined)  by 
aligning  the  F.  heteroclitus  sequence  with  the  N-terminal  sequences  of  several  other 
piscine  Vtgs  (Folmar  et  al.  in  press).  This  prediction  can  be  compared  to  that  resulting 
from  the  method  of  von  Heijne  (1986),  represented  in  Figure  2.2  by  asterisks.  We  made 
several  attempts  to  determine  the  signal  peptide  sequence  through  N-terminal  sequencing 
of  Vtg  isolated  from  the  blood  of  estrogen-treated  male  F.  heteroclitus,  all  of  which 
resulted  in  inconclusive  residue  readings,  suggesting  that  the  secreted  Fundulus  Vtg  is 
N-terminally  blocked.  Five  internal  peptide  sequences  predicted  to  offer  high  antigenicity 
by  the  method  of  Hopp  and  Woods  (1981)  are  represented  by  shaded  lettering  in  Figure 
2.2.  The  end  of  the  cDNA  sequence  was  revealed  by  a  poly-adenylation  site 
(AATAAA),  beginning  at  bp  5165  and  denoted  by  underlining. 
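The Hopp and Woods (1981) calculation can be sketched in Python as a sliding-window average of per-residue hydrophilicity values. The six-residue window is an assumption: the text does not state the window size, but the Ah values printed in Figure 2.2 (for example 2.07 and 2.1) are reproduced by hexapeptide averages. The function names are hypothetical, and the two test peptides are the translated rows beginning at nucleotides 1405 and 4429.

```python
# Hopp and Woods (1981) hydrophilicity values per residue.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}

def window_average(peptide):
    """Mean hydrophilicity (Ah) of a peptide."""
    return sum(HOPP_WOODS[aa] for aa in peptide) / len(peptide)

def most_hydrophilic(seq, window=6):
    """Return (peptide, Ah rounded to 2 decimals) for the best-scoring window."""
    starts = range(len(seq) - window + 1)
    best = max(starts, key=lambda i: window_average(seq[i:i + window]))
    peptide = seq[best:best + window]
    return peptide, round(window_average(peptide), 2)

site_a = most_hydrophilic("KNEEENII")
site_b = most_hydrophilic("DELKFMVLLRKDSSEQHH")
```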

A scan of the sequence for post-translational modification sites of the putative
protein revealed 16 potential N-glycosylation sites, 13 potential N-myristoylation sites,
and potential phosphorylation sites for the following kinases: 7 for cAMP- and cGMP-
dependent protein kinase; 39 for protein kinase C; 23 for casein kinase II; and finally,
a single site for tyrosine kinase (Fig. 2.3). We have highlighted the polyserine domain
in Figure 2.3 with asterisks. The asterisks signify that, in addition to the predicted
phosphorylation sites for the above-mentioned kinases, past studies in F. heteroclitus
(Wallace and Begovac, 1985) and in other non-mammalian vertebrates (Mecham and
Olcott, 1949; Mano and Lipmann, 1966; Wiley and Wallace, 1981; Byrne et al., 1984)
suggest that almost all serine residues within the phosvitin region are phosphorylated by
an as yet uncharacterized "vitellogenin kinase" activity.

Figure 2.3  A schematic representation of potential sites for post-translational
modifications of the putative F. heteroclitus Vtg protein as predicted by
the Prosite program (Bairoch et al., 1995). The symbols mark predicted
N-glycosylation, N-myristoylation, and phosphorylation sites. Phosphorylation
sites represent potential targets for the following kinases: cAMP- and cGMP-
dependent kinase, protein kinase C, casein kinase II, and tyrosine kinase.
The region denoted by asterisks represents the polyserine domain. Past
studies suggest that in addition to the sites displayed by the above-
mentioned kinases, every serine residue in this region undergoes
phosphorylation by an as-yet unidentified "vitellogenin kinase."

Protein  Alignments 

Alignment  of  the  F.  heteroclitus  Vtg  sequence  with  other  selected  vertebrate  Vtgs 
is  shown  in  Figure  2.4.  Partial  Vtg  cDNA  translations  published  from  three  other  fish 
species  are  included.  Pairwise  comparisons  of  these  vertebrate  Vtg  sequences  against  the 
F.  heteroclitus  sequence  result  in  similar  degrees  of  identity:  Gallus,  38%;  Xenopus, 
39%;  Acipenser,  38%;  and  Ichthyomyzon,  37%.  Against  the  two  smaller  teleost 
sequences, the F. heteroclitus sequence shares 50% identity with rainbow trout
(Oncorhynchus) but only 30% with tilapia (Oreochromis). These last two values should be
considered  only  preliminary  until  more  sequence  information  becomes  available. 
Attempting  to  find  an  obvious  difference  between  the  F.  heteroclitus  Vtg  and  that  of  the 
other  vertebrates,  we  compared  several  types  of  predicted  structural  analysis  scales 
including  those  by  the  methods  of  Hopp  and  Woods  (1981),  Kyte  and  Doolittle  (1982), 
and  Janin  (1979).  There  were  no  striking  differences  revealed  by  these  methods  that 
might  account  for  the  greater  solubility  of  the  F.  heteroclitus  yolk  proteins  (data  not 
shown). 
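Percent identity between two aligned sequences can be computed as in the Python sketch below. The convention used here, counting only columns in which neither sequence has a gap, is an assumption; the convention applied by ALIGN Plus is not stated in the text, and other denominators (alignment length, shorter sequence length) give different values. The two short strings are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned fragments: 5 identical residues over 6 compared columns.
pid = percent_identity("MKAV-LA", "MRAVGLA")
```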

The  phylogram  in  Figure  2.5  was  created  using  the  program  PAUP  (Swofford, 
1993)  from  an  alignment  (not  shown)  containing  the  first  five  vertebrate  Vtgs  listed  in 
Figure  2.4,  plus  three  invertebrate  Vtgs  from  boll  weevil,  Anthonomus  grandis, 
mosquito,  Aedes  aegypti,  and  finally  Vtg  5  from  C.  elegans,  defined  as  an  outgroup.  In 


Figure  2.4  Alignment  of  the  putative  F.  heteroclitus  Vtg  sequence  (gi: 459202) 
with  other  vertebrate  Vtgs:  the  chicken  Gallus  domesticus  Vtg  II 
(van  het  Schip  et  al.,  1987);  Xenopus  laevis  Vtg  A2  (Gerber-Huber 
et  al.,  1987);  the  white  sturgeon  Acipenser  transmontanus  Vtg 
(Bidwell and Carlson, 1995); the silver lamprey Ichthyomyzon
unicuspis  Vtg  (Sharrock  et  al.,  1992);  and  the  C-termini  from  the 
rainbow  trout  Oncorhynchus  mykiss  Vtg  (LeGuellec  et  al.,  1988) 
and  the  tilapia  Oreochromis  aureus  (Ding  et  al.,  1990)  Vtg  as 
constructed  by  ClustalV  (Higgins  et  al.,  1992)  and  modified  by 
eye. Our defined polyserine domain, which includes putative Pv
regions,  is  labeled  and  underscored  with  a  triple  dashed  line. 
Residues  identical  in  at  least  four  of  the  aligned  sequences  are 
denoted  by  shaded  lettering.  Sequence  gaps  are  represented  as 
dashes. 
Fundulus      MKAVVL-ALTLAFVAGQ--NFAPEFAAGKTYVYKYEALILGGLPEEGLARAGLKISTKLL  57
Gallus        MRGIIL-WiVLTLVGSQKFDIDPGgNSRRSYLYNYEGSMLNGLQDRSLGKAGVRLSSkLE  59
Xenopus       NfKGIVL-ALLLALAGSERTHIEPVFSESKISVYNYEAVILNGFPESGLSRAGIKINCKVE  59
Acipenser     ---LTIALVGSQQTKYEPSFSGSKTYQYKYEGVILTGLPEKGLlARAGLKVHCKVE  52
Ichthyomyzon  MWKLLLVALAFALADAQ-FQPGKVYRYSYDAFSISGLPEPGVNRAGLSGEMKIE  53

Fundulus      LSAADQNTYMLKLVEPELSEYSGIWPKDPAVPATKLTAALHLSSQFPSSLNTPMVFVGKV  117
Gallus        ISGLPENAYLLKVRSPQVEEYNGVWPRDPFTRSSKITQVISSCFTRLFKFEYSSGRIGNI  119
Xenopus       JSAYAQRSYFLKIQSPEIKEYNGVWPKDPFiTRSSKLTQALAEQLTKPARFEYSNGRVGDE  119
Acipenser     ISEVAQKTYLLKILNPEIQEYNGIWPKAPFYPASKLTQALASQLTQPIKFQVRNGQVGDr
Ichthyomyzon  IHGHTHNQATLKITQVNLKYFLGPWPSDSFYPLTAGYDHFIQQLEVPVRFDYSAGRIGDI  113

Fundulus      FAPEEVSTLVLNIYRGILNILQLNIKKTHKVYDLQEVGTQGVCKTLYSISEDARIENILL  177
Gallus        YAPEDCPDLCVNIVRGILNMFQMTIKKSQNVYELQEAGIGGICHARYVIQEDRKNSRIYV  179
Xenopus       FVADDVSDTVANIYRGILNLLQVTIKKSQDVYDLQESSVGGICHTRYVIQEDKRGDQIRI  179
Acipenser     FASEDVSDTVLNIQRGILNMLQLTIKTTQNVYGLQENGIAGICEASYVIQEDRKANKIIV  172
Ichthyomyzon  YAPPQVTDTAVNIVRGILNLFQLSLKKNQQTFELQETGVEGICQTTYWQEGYRTNEMAV  173

Fundulus      TKTRDLSNCQERLNKDIGLAYTEKCDKCQEETKNLRGTTTLSYVLKPVADAVMILKAYVN  237
Gallus        TRTVDLNNGQEKVQKSIGMAYIYPCPVDVMKERLTKGTTAFSYKLKQSDSGTLITDVSSR  239
Xenopus       IKSTDFNNCQDKVSKTIGLELAEFCHSeKQLNRVIQGAATYTYKI,KGRDQGTVIMEVTAR  239
Acipenser     TKSKDLNNCNEKIKMDIGMAYSHTGSNGRKIRKNTRGTAAYTYIIjKPTDTGTLITQATSO  232
Ichthyomyzon  VKTKDLNNCDHKVYKTMGTAYAERCPTCQKMNKNLRSTAVYNYAIFDEPSGYIIKSAHSH  233

Fundulus      ELIQFSPFSEANGAAQMRTKQSLEFLEIEKEPIPSVKAEYRHRGSLKYEFSDELLQTPLQ  297
Gallus        QVYQISPFNEPTGVAVMEARQQCTLVEVRSERGSAPDVPMQNYGSLRYRFPAVLPQMPLQ  299
Xenopus       QVLQVTPFAERHGAATMESRQVIAWVGSKSGQLTPPQIQLKNRGNLHYQFASELHQMPIH  299
Acipenser     EVHQLTPFNEMTGAAITEARQKLVLEDAKVIHVTVPEQELKNRGSIQYQFASEILQTPIQ  292
Ichthyomyzon  EIQQLSVEDIKEGNWIESRQKLILEGIQSAP[...]SQAASLQNRGGEMYKFPSSAITKMSS  293

Fundulus      LI--KISDAPAQVAEVLKHLATYNIEDVHENAPLKFLELVQLLRIARYEDLEMYWNQYKK  355
Gallus        EI--KTKNPBQRIVETLQHIVLNNQQDFHDDVSYRFLEWQLCRIANADNtiESIWRQVSD  357
Xenopus       LM--KTKSPEAQAVEVLQHLVQDTQQHIREDAPAKFMLVQLLRASNFENtQALWKQFAQ  357
Acipenser     Lt--KTRSPETKIKEVLQHLVQNNQQQVQSDAPSKFLQLTQLLRACTHENIEGIWRQYEK  350
Ichthyomyzon  LFVTKGKNLESEIHTVLKHLVENKQLSVHEDAPAKFLRLTAFLRNVDAGVLQSIWHKLHQ  353

Fundulus      MSPHRHWFLDTIPATGTFAGLRFIKEKFMAEEITIAEAAQAFITAVHMVTADPEVIKLFE  415
Gallus        KPRYRRWLLSAVSASGTTETtKFLKNRIRNDD£NYIQTLLTVSLTLHLLQADEHTLPIAA  417
Xenopus       RTQYRRCLLDALPMAGTVDCEKFIKQLIHNEELTTQEAAVLITFAMRSARPGQRNFQISA  417
Acipenser     TQLYRRWILDALPAAATPTAFRPITQRIMKRDLTDAEAIQTLVTAMHLVQtSivQMAA  410
Ichthyomyzon  QKDYRRWIIJDAVPAMATSEA[...]PRESLSYAR  413

Fundulus      SLVDSDKVVENPLLREVVFLGYGTMVNKYCNKTVDCPVELIKPIQQRLSDAIAKNEEENI  475
Gallus        DLMTSSRIQKNPVLQQVACLGYSSVVNRYCSQTSACPKKAE[...]
Xenopus       DLVQDSKVQKYSTVHKAAIliAYGTMVRRYCDQLSSGPEHALEELHELAAEAANKGHYEDI  477
Acipenser     ELVFDRANLKCPVLRKHAVIAYGSMVNRYCAETLNCREEALKPLHDFANDAISRAHE[...]  470
Ichthyomyzon  ELI[...]TSFIRNRPILRKTAVM[...]
Fundulus      ILYIKVLGNAGHPSSFKSLTKIMPIHGTAAVSLPMTIHVEAIMALRNIAKKESRMVQELA  535
Gallus        KLALKCIGNMGEPASLKRILKFLPISSSSAADIPVHIQIDAITALKKIAWKDPKTVQGYL  537
Xenopus       ALALKALGNAGQPESIKRIQKFLPGFSSSADQLPVRIQTDAVMALRNIAKEDPRKVQEIL  537
Acipenser     VLALKALGNAGQPSSIKRIQKCLPGFSSGASQLPVKIQVDAVMALRNIAKKEPGKVQELT  530
Ichthyomyzon  VLALKALGNAGQPNSIKKIQRFIiPGQGKSLDEYSTRVQAEAIMALRNIAKRDPRKVQEIV  533

Fundulus      LQLYMDKALHPELRMLSCIVLFETSPSMGLVTTVANSVKTEE--NLQVASFTYSHMKSLS  593
Gallus        IQILADQSLPPEVRMMACAVIFETRPALALITTIANVAMKESKTNMQVASFVYSHMKSLS  597
Xenopus       LQIFMDRDVRTEVRMMACLALFETRPGLATVTAIANVAARESKTNLQLASFTFSQMKALS  597
Acipenser     MQLFMDHQLHSEVRMVASMVLLETRPSMALVATLAEALLKE--TSLQVASFTYSHMKAIT  588
Ichthyomyzon  LPIFLNVAIKSELRIRSCIVFFESKPSVALVSMVAVRLRREP--NLQVASFVYSQMRSLS  591

Fundulus      RSPATIHPDVAAACSAAMKILGTKLDRLSLRYSKAVHVDLYNSSLAVGAAATAFYINDAA  653
Gallus        KSRLPFMYNISSACNIALKLLSPKLDSMSYRYSKVIRADTYFDNYRVGATGEIFWNSPR  657
Xenopus       KSSVPHLEPLAAACCVALKILNPSLDNLGYRYSKVMRVDTFKYNLMAGAAAKVFIMNSAN  657
Acipenser     RSTAPENHALSSACNVAVKLLSRKLDRLSYRYSKAMHMDTFKYPLMAGAAANIHIINNAA  648
Ichthyomyzon  RSSNPEFRDVAAACSVAIKMLGSKLDRLGCRYSKAVHVDTFNARTMAGVSADYFRINSPS  651

Fundulus      TFMPKSFVAKTKGFIAGSTAEVLEIGANIEGLQELILKNPALSESTDR  701
Gallus        TMFPSAIISKLMANSAGSVADLVEVGIRVEGLADVIMKRNIPFAEYPT-  705
Xenopus       TMFPVFILAKFREYTSLVENDDIEIGIRGEGIEEFLRKQNIQFANFPM-  705
Acipenser     SILPSAWMKFQAYILSATADPLEIGLHTEGLQEVLMQNHEHIDQMPS-  696
Ichthyomyzon  GPLPRAVAAKIRGQGMGYASDIVEFGLRAEGLQELLYRGSQEQDAYGTALDRQTLLRSGQ  711

Fundulus      ITKMKRVIKALSEWRSLPTSKPLASVYVKFFGQEIGFANIDKPMIDKAVKFGKELP  757
Gallus        YKOIKELGKALQGWKELPTETPLVSAYLKILGQEVAFININKELLQQVMKTWEPA  761
Xenopus       RKKISQIVKSLLGFKGLPSQVPLISGYIKLFGQEIAFTELNKEVIQNTIQALNQPA  761
Acipenser     AGKIQQIMKMLSGWKSVPSEKTLASAYIKLFGQEISFSRLDKKTIQEALQAVREPV  752
Ichthyomyzon  ARSHVSSIHDTLRKLSDWKSVPEERPLASGYVKVHGQEWFAELDKKMMQRISQLWHSAR  771

Fundulus      IQEYGREALKALLLSGINFHYAKPVLAAEMRRILPTVAGIPMELSLYSAAVAAASV  813
Gallus        DRNAAIKRIANQILNSIAGQWTQPVWMGELRYWPSCLGLPLEYGSYTTALARAAV  817
Xenopus       ERHTMIRNVLNKLLNGWGQYARRWMTWEYRHIIPTTVGLPAELSLYQSAIVHAAV  817
Acipenser     ERQTVIKRWNQLERGAAAQLSKPLLVAEVRRILPTCIGLPMEMSLYVSAVTTADI  808
Ichthyomyzon  SHHAAAQEQIRAWSKLEQGMDVLLTKGYWSEVRYMQPVCIGIPMDLNLLVSGVTTNRA  831

Fundulus      EIKPNTSPRLSADFDVKTLLETDVELKAEIRPMVAMDTYAVMGLNTDIFQAALVARAKLH  873
Gallus        SVEGKMTPPLTGDFRLSQLLESTMQIRSDLKPSLYVHTVATMGVNTEYFQHAVEIQGEVQ  877
Xenopus       NSDVKVKPTPSGDESAAQLLESQIQLNGEVKPSVLVHTVATMGINSPLFQAGIEFHGKVH  877
Acipenser     NVQAHITPSPTNDFNVAQLLNSNIVtHTDVTPSIAMHTIAVMGINTHVIQTGVELHVKAR  868
Ichthyomyzon  NLHASFSQSLPADMKLADLLATNIEliRVAATTSMSQHAVAIMGLTTDLAKAGMQTHYKTS  891

Fundulus      SVVPAKIAARLNIKEGDFKLEALPVDVPENITSMNVTTFAVARNIEEPLVERITPLLPTK  933
Gallus        TRMPMKFDAKIDVKLKNLKIETNPCREETEIVVGRHKAFAVSRNIGELGVEKRTSIEiPED  937
Xenopus       AHLPAKFTAFLDMKDRNFKIETPPFQQENHLVEIRAQTFAFTRNIADLDSARKTLVVPRN  937
Acipenser     TTVPMKFTAKIDLKEKNFKIESEPCQQETEVLSLSAQAFAISRNVEDLDAAKKNPLLPEE  928
Ichthyomyzon  AGLGVNGKIEMNARESNFKASLKPFQQKTWVLSTMESIVFVR---DPSGSRILPVItPPK  948

Fundulus  VLVP -  -I  PI RRHTSKLDPTRNSMLDSSEL-  - LPMEEEDVEPI PEYKF  RRFA  980 

Gallus  APLD-  -  VTEEPFQTSERASREH  FAMQGPDS  - -MPRKQSHSSREDLRRSTGKRAHK  988 

Xenopus  NEQN-  -  I LKKHFETTGRTSAE  GASMMEDSSEM-  -GPKKYSAEPGHHQYAPN  INS  987 

Acipenser  AVRN-  -  ILNEQFNSGTEDSNERERAGKFARPSAEM-  -MSQELMNSGEHQNRKGA  HAT  981 

Ichthyomyzon  MTLDKGLISQQQQQPHHQQQPHQHGQDQARAAYQRPWASHEFSPAEQKQIHDIMTARPVM  1008 


Figure  2.4-continued 
Fundulus   KK-  -  YCAKHIGVGLKACFKFASQNGASIQDIVLYKLAGSHNFSFSVTPIEGE-  -WERLE  1036

Gallus  RD-  -  ICLKMHHIGCQLCFSRRSRDASFIQNTYLHKLIGEHEAKIVLMPVHT-DADIDKIQ  1045 

Xenopus   YD- - ACTKFSKAGVHLCIQCKTHNAASRRNTIFYQAVGEHDFKLTMKPAHT- EGAIEKLQ  1044 

Acipenser  RS-  - ACAKAKNFGFEVCFEGKSENVAFLRDSPLYKI IGQHHCKIALKPSHSSEATIEKIQ  1039 

Ichthyomyzon   RRKQHCSKSAALSSKVCFSARLRUAAFIRNALLYKITGDYVSKVYVQPT-SSKAQIQKVE  1067

Fundulus  MEVKVGAKAAEKLVKRINLSEDEETEEG-  -GPVLVKLNKIIi     1075 

Gallus  LEIQAGSRAAARI ITEVNPESEEEDESSPYEDIQAKLKRILGIDSMFKVANKTRHPKNRP  1105 

Xenopus  LE I TAGPKAASK I MGLVEVEGTEGE PMDE - TAVTKRLKM I LG I DESRKDTNETALYRSKQ  1103 

Acipenser  LELQTGNKAASKI IRWAMQSLAEADEMK-GNILKKLNKLLTVDGE   1084 

Ichthyomyzon   LELQAGPQAAEKVI RMVELVAKAS KKSKKNST ITEEGVGETI I SQLKK I LSSDKDK   1123

>===POLYSERINE DOMAIN===

Fundulus    1075 

Gallus  SKKGNTVLAEFGTEPDAKTSSSSSSASSTATSSASSSASSPNRKKPMDEEENDQVKQARN  116  5 

Xenopus  KKKNKI  HNRRLDAE  WEARK  1123 

Acipenser  T  1085 

Ichthyomyzon   DAKKPPGSSSSSSSSSSSSSSSSSSDKSGKKTPRQGSTVNLAAKR  1168

Fundulus   SSRRNSSSSSSSSSSSSSESRSSRSSSSSSSSSRSSRKIDLAARTNSSSSSS  1127 

Gallus  KDASSSSRSSKSSNSSKRSSSKSSNSSKRSSSSSSSSSSSSRSSSSSSSS  SSNSK  1220 

Xenopus  QQSSLSSSSSSSSSSSSSSSSSSSSSSSSSPSSSSSSSYSKRSKRREHNPHHQRESSS-S  1182 

Acipenser  QDSTLRGFKRRSSSSSSS3SSS3SSSSSSSSSSSQQSRMEKRMEQDKLTENLERDRDHMR  1145 

Ichthyomyzon  ASKKQRGKDSSSSSSSSSSSSDS'SKSPHK-  -HGGAKRQHAGHGAPHLGPQSHSSSSSSSS  1226
======================POLYSERINE DOMAIN====================

Fundulus  SRRSRSS  SSSSSSSSSSSSSSSSSSRRSSSSSSSSSSSSSRSSRR  1172 

Gallus  SSSSSSKSSSSSSRSRSSSKSSSSSSSSSSSSSSKSSSSRSSSSSSKSSSHHSHSHHSGH  1280 

Xenopus  SSQEQNKKRNLQENRKHGQKGMSSS3SSSSSSSSSSSSSSSSSSSSSSSSEENRPHKNRQ  124  2 

Acipenser  GKQSKNKKQEWKNKQKKHHKQLPSSSSSSSSSSSGSNSSSSSSSSSSSSS  RSHNHRN  1202 

Ichthyomyzon  SSSSSSASKSFSTVKPPMTRKPRPARSSSSSSSSDSSSSSSSSSSSSSSSSSSSS   1281

======================POLYSERINE DOMAIN====================

Fundulus  VNSTRSSSSSSRTSSASSLASFFSDSSSSSSSSD-  -  RRSKEVME - KFQRLHK - K  1222 

Gallus  LNGSSSSSSSSRSVSHHSHEHHSGHLEDDSSSSSSSSVLSKIWGRHEI YQYRFRSAHR-Q  13  3  9 

Xenopus   -  - -HDNKQAKMQSNQHQQKKNKFSESSSSSSSSSSSEMWNKKKHHRNFYDLNFRRTAR-T  1298 

Acipenser  NTRTLSK  SKRYQNNNNSSSSSGSSSSSEEIQKNPEIFAYRFRSHRD-K  1249 

Ichthyomyzon   SSSESKSLEWLAVKDVNQSAFYNFKYVPQRKPQ  1314

Fundulus  MV  ASGSSASSVEAI YKEK  KYLGEEEA- WAVILRAVKADKRMV  1264 

Gallus  EFPKRKLPGDRATSRYSSTRSSHDTSRAASWPKFLGDIKTFVLAAFLHGISNNKKTG  13  96 

Xenopus  KGTEHRGSRLSSSSESSSSSSESAY  RHKA  KFLGDKEPPVLWTFKAVRNDNTKQ  1352 

Acipenser  LGFQNKRGRMSSSSSSSSSSSSQSTLNSKQDA  KFLGDSSPPIFAFVARAVRSDGLQQ  13  06 

Ichthyomyzon  TSRRHTPASSSSSSSSSSSSSSSSSSSDSDMTVSAESFEKHSKPKWIVLRAVRADGKQQ  1374
============POLYSERINE DOMAIN==========<

Fundulus  GYQLGFYLD  KPNARVQ 1 1  VAN  I SSDSNWR I GADAWLS  KHKVTTK I S  WGEQCRKYST  13  21 

Gallus  GLQLWYAD  -  TDSVRPRVQVFVTNLTDSS KWKLCADASVRNAPQAVAYVKWGWDCRDYKV  14  55 

Xenopus  GYQMWYQE  -  YHSSKQQIQAYVMDI  - SKTRWAACFDAVWNPHEAQASLKWGQNCQDYKI  1410 

Acipenser  GYQVAAYTD-NRVSRPRVQLLATEI IEKSRWQICADAILASNYKAMALMRWGEECQDYKV  1365 

Ichthyomyzon  GLQTTLYYGLTSNGLPKAKIVAVELSDLSVWKLCAKFRLSAHMKAKAAIGWGKNCQQYRA  1434

Oncorhynchus                 LGRPKTTSDEPNI ITAALDENDNWKLCADGVLLSKHKVNAKIAWGAGCKDYNT  53 


Figure  2.4-continued 
Fundulus  NVTGETGIVSSSPAARLRVSWERLPSTLKRYGK-MVNKYVP-VKILSDLIHTKRENSTRN  1379

Gallus  STELVTGRFAGHPAAQVKLEWPKVFSNVRSVVE  -  WFYEFVPGAAFMLGFSERMDKNPSRQ  1514

Xenopus  NMKAETGNFGNQPALRVTANWPKIPSKWKSTGK- WGEYVPGAMYMMGFQGEYKRNSQRQ  14  6  9 

Acipenser  AVSAVTGRLASHPS LQ I KAKWSR I PRAAKQTQN -  ILAEYVPGAAFMLGFSQKEQRNPSKQ  14  24 

Ichthyomyzon  MLEASTGNLQSHPAARVD I KWGRLPSSLQRAKNALLENKAPVI ASKLEME IMPKKNQKHQ  1494

Oncorhynchus  FITAETGLVGPSPAVRLLDKLPKVPKAVWRYVRIVSEFIPGHIPYY1ADLVPMQKDKNSE  113


Fundulus    ISVIAVATSEKTIDI ITKTPMSSVYNVTMHLPMCIPIDE-  - 1  KGLSP- -  FDEVIDKI  14  32 

Gallus  ARMWALTSPRTCOVWKLPDIILYQKAVRLPLSLPVGP-  -RIPASELQPPIW-NVFAEA  1571

Xenopus    VKLVFALSS PRTCDW I R I PRLTVYYRALRLPVP I PVGH -  - HAKENVLQTPTW - N I FAEA  1526 

Acipenser   FK 1 1 LAVTSPNT I DTL IKAPKI TLFKQAVQ I PVQ IPMEP-  - SDAER -  - RS PGLAS IMNEI  1480 

Ichthyomyzon    VS V I LAAMTPRRMN I IVKLPKVTYFQQGILLPFTFPSPRFWDRPEGSQSDSLPAQIASAF  1554

Oncorhynchus    KQFTWATSERTLDVILKTPKMTLTKTGVNLPCSLPFESMTDLSPFDDNIVNKIHYL-  -F  171 

Fundulus    HFMVSKAAAAECSFVEDTI.YTFMMRSYKNKMPSSCYQVAAQDCTDELKFMVLLRK- -DSS  14  90 

Gallus   PSAVLENLKARCSVSYNKIKTFNEVKFNYSMPANCYHILVQDCSSELKFLVMMKSAGEAT  1631 

Xenopus    PKLIMDSIQGECKVAQDQITTFNGVDLASALPENCYNVLAQDCSPEMKFMVLMRNSKESP  1586 

Acipenser   PFLIEEATKSKGVAQENKFITFDGVKFSYQMPGGCYHILAQDCRSKVRFMVMLKQASMSK  1540 

Ichthyomyzon    SGIVQDPVASACELNEQSLTTFNGAFFNYDMPESCYHVLAQECSSRPPFIVLIKLDSERR  1614

Oncorhynchus   S  EVNAVKCSMVRDTLTTFIJNKKYKINMPLSCYQVLAQDCTTELKFMVSAEEGSVHL  227 

Fundulus    EQHHINVKISEIDIDMF-PKDDNVTVKVNEMSIPPPA-CLTATQQLPLKIKTKRRGLAVY  154  8 

Gallus   NLKAINIKIGSHEIDM-HPVNGQVKLLVDGAESPTANISLIS -AGASLWIHNENQGFALA  1689 

Xenopus   NHKDINVKLGEYD^PMYYSA-DAFKMKI^fNLEVSEEHLPYKSFNYPTVE|KKKGNGVSLS  164  5 

Acipenser  NLRAVNAKIYNKDIDILPTTKGSVRLLINNNEIPLSQLPFTD-SSGNIHIKRADEGVSVS  15  99 

Ichthyomyzon    I  -  -SLELQLDDKKVKIVSRND  IRVDGEKVBLRRLSQKN  QYGFLVLDAGVULL  1664

Oncorhynchus   NKTTSNVKISDIDVDLYTQDHGVIVKVNEMEVSNEQLPYKDPSG-  S IKIDRKKGEGVSLY  2  8  6 

Fundulus   A  P  S  HGLQ  EVYFDR  KTWR I  KVADWMKGKTCGtCGKADGEI  RQEYHTPNGRVAKNS I SFAHS  1608 

Gallus  APGHGIDKLYFDGKTITIQVPLWMAGKTCGICGKYDAECEQEYRMPNGYLAKNAVSFGHS  174  9 

Xenopus    ASEYGIDSLDYDGLTFKFRPTIWMKGKTCGI CGHNDDESEKELQMPDGSVAKDQMRFIHS  1705 

Acipenser  AQQYdfliESLYFDGKTVQVKVTSEMRGKTCGLCGHNDGERRKEFRMPDGRQARGP-  -  -  -  -  -  1653 

Ichthyomyzon    LKYKDL  -  RVSFNSSS  VQVWVPSSLKGQTCGLCGRNDDELVTEMRMPNLEVAKDFTSFAHS  1723

Oncorhynchus   APSHGLQKVYFDKYSWKIKWDWMKGQTCGLCGKADGENRQEYRTPSGRLTKSSVSFAHS  34  6 

Oreochromis                                                                                                        FFFSLVFHAVS  11 

Fundulus   WILPAESCRDASECRLKLESVQLEKQLTIHGEDSTeFSVEPVPRGLPGGLPVKTTPVTVG  1668 

Gallus   WILEEAPCRGA-  -CKLHRSFVKLEKTVQLAGVDSKCYSTEPVLRCAKGCSATKTTPVTVG  1807 

Xenopus    WILPAESCSEG- - GNLKHTLVKLEKAIATDGAKAKCYSVQPVLRCAKGCSPVKTVEVSTG  1763 

Acipenser   -----   ....  .  SVSPTPG  1660 

Ichthyomyzon    WIABDETCGGACALSRQ-  -TVHKESTSVISGSRENCYSTEEIMRCPATCSASRSVPVSVA  1781

Oncorhynchus   WVLPSDRG- DASEG-LM-  - KLEKQVIVDD- RESK- CYSVEPVLRCLPGCSPVRTTPITIG  400 

Oreochromis   KKLQNHYSLRLLKEKVKS  ELMVPILKVSEPNATLLSPCCSACPACIPVRTTTVNVG  6  7 

Fundulus    -FSCtASDPQ  TSVYD  -  RSVDLRQTTQAHLACSCNTK  -  GS  -  1704 

Gallus    -  FHCLPADS ANSLTDKQ - MKYDQKS EDMQDTVDAHTTCSGENEEGST  1852 

Xenopus    -  FHGLPSDVSLDLPEGQ  -  IRLE- KSEDFSEKVEAHTACSCETSPCAA  1807 

Acipenser    -  - LCLEKTATEAASFCVIM  1677 

Ichthyomyzon    -MHCLPAESEAI  SLAMS  EGRPFSLSGKSEDLVTEMEAHVSCVA  1823

Oncorhynchus    - -HCLPFDSNLNRSEGLSSIY-EKSVDLMEKAEAHVACRCSEQ-CM  442 

Oreochromis   FYGCIiPSDTT- VDRSGLSSFF-EKSIDLRDTAEAHLACRCTPQ-CA  110 


Figure  2.4-continued 
the single best tree, F. heteroclitus Vtg was placed on an independent branch,
intermediate to the positions of sturgeon and lamprey Vtgs. The Ichthyomyzon sequence
was the vertebrate Vtg determined to lie furthest from the reference sequence, thereby
placing it nearest to the outgroup. One of the more significant relationships provided by
the tree is indicated by the bootstrapping values at the Acipenser branch (in parentheses,
Fig. 2.5): through 100 bootstrap replicates, sturgeon Vtg was partitioned with the two
tetrapod Vtgs 95% of the time, substantially more often than the Vtgs of either F. heteroclitus
(67%) or Ichthyomyzon (31%; not shown).
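
The bootstrap percentages quoted above rest on resampling the columns of the alignment with replacement and repeating the tree search on each pseudo-replicate. A minimal Python sketch of the resampling step alone follows. The function names, the toy alignment, and the per-replicate flag list are illustrative assumptions; the tree search itself (performed with PAUP in this study) is not reproduced here.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping a sequence name to its aligned sequence;
               all sequences must have the same length.
    rng:       a random.Random instance (passed in for reproducibility).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    n_columns = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    drawn = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in drawn)
            for name, seq in alignment.items()}


def support_percentage(replicate_flags):
    """Percentage of replicates in which a given grouping was recovered."""
    return 100.0 * sum(replicate_flags) / len(replicate_flags)


# Hypothetical toy alignment, not data from Fig. 2.4.
toy = {"taxon_a": "ACDE", "taxon_b": "ACDF", "taxon_c": "GCDE"}
replicate = bootstrap_alignment(toy, random.Random(0))
```

In a full analysis, each replicate would be passed to the tree-search program and a flag recorded for whether the grouping of interest appeared; `support_percentage` then turns those flags into a value comparable to the percentages shown in parentheses in Fig. 2.5.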

Polyserine  Domain 

We  have  designated  a  polyserine  domain  from  each  of  the  aligned  Vtgs 
(underscored  with  a  triple  dotted  line  in  Fig.  2.4;  see  Materials  and  Methods)  and 
compared  them  in  regard  to  size,  relative  serine  composition  and  serine  codon  usage 
(Fig.  2.6).  Of  the  Vtgs  listed  here,  F.  heteroclitus  Vtg  contains  the  smallest  polyserine 
domain  (171  a.a.);  it  also  contains  the  highest  relative  serine  composition  (57.6%).  We 
compared  the  serine  codon  usage  from  each  of  the  domains  and  found  a  consistent 
pattern: TCX repeats are more prevalent at the 5' end while AGY codons are more
prevalent  at  the  3'  end.  Finally,  of  the  six  possible  serine  codons,  AGC  was  invariably 
the  dominant  codon  in  all  five  vertebrate  polyserine  domains. 
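
The codon comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used for Fig. 2.6: the input is assumed to be an in-frame coding sequence for the domain, the 5'/3' split is taken at the midpoint of the domain, and the function names and toy sequence are hypothetical. TCX denotes TCT, TCC, TCA, and TCG; AGY denotes AGT and AGC.

```python
from collections import Counter

SERINE_TCX = {"TCT", "TCC", "TCA", "TCG"}
SERINE_AGY = {"AGT", "AGC"}


def serine_codon_profile(cds):
    """Count TCX and AGY serine codons in the 5' and 3' halves of a CDS.

    cds: in-frame nucleotide sequence (DNA or RNA); trailing bases that
         do not complete a codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    midpoint = len(codons) // 2  # assumed boundary between the two halves
    profile = {"five_prime": Counter(), "three_prime": Counter()}
    for index, codon in enumerate(codons):
        half = "five_prime" if index < midpoint else "three_prime"
        if codon in SERINE_TCX:
            profile[half]["TCX"] += 1
        elif codon in SERINE_AGY:
            profile[half]["AGY"] += 1
    return profile


def serine_fraction(protein):
    """Relative serine composition of a one-letter amino acid sequence."""
    return protein.upper().count("S") / len(protein)


# Hypothetical toy domain: four TCX codons, one Arg codon, five AGY codons.
toy_cds = "TCTTCCTCATCG" + "CGT" + "AGCAGCAGTAGCAGC"
toy_profile = serine_codon_profile(toy_cds)
```

For the toy sequence, all TCX codons fall in the 5' half and all AGY codons in the 3' half, which is the qualitative pattern reported for the five vertebrate domains.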

Discussion 


We  present  the  first  complete  teleost  Vtg  cDNA  sequence  along  with  its  translated 
primary structure. F. heteroclitus Vtg shares 37%-38% identity with other vertebrate
Vtgs and includes the characteristic N-terminal Lv1 region, an internal Pv region, and
a C-terminal Lv2 region. The genetic organization of the polyserine domain is consistent
with that found in other vertebrates, from lamprey to chicken, suggesting, at the latest,
a pre-gnathostome arrival of this domain into the Vtg gene. In contrast to other
vertebrate Vtgs, F. heteroclitus Vtg is predicted to be ~100 amino acids shorter, and
contains a polyserine region with a 10-20% higher relative serine composition than the
other vertebrate Vtgs. We suspect that the occurrence of liquid phase yolk in F.
heteroclitus is in part due to differences within its Vtg polyserine domain as compared
with the polyserine domains of insoluble yolk producers. The unusually serine-rich
domain would eventually be modified into a polyphosphoserine domain,
endowing the resulting Pv yolk protein with an uncommonly strong hydrophilic potential.
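
Percent identity figures such as those quoted above are computed over the columns of a pairwise alignment. The sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the comparison. That convention, the function name, and the short example fragments are assumptions for illustration; the text does not state which convention was used.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped, so the
    denominator is the number of ungapped aligned positions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 8 ungapped columns, 5 of them identical.
example = percent_identity("KALGNAG-P", "KCIGNMGEP")
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) give somewhat different values for the same alignment, which is worth keeping in mind when comparing identity figures across reports.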

On  examination  of  the  alignment  in  Figure  2.4,  the  conserved  organization  of 
vertebrate  Vtg  is  evident:  two  well-aligned  termini  interrupted  by  a  polymorphic 
polyserine  domain.  The  degree  of  Vtg  conservation  among  several  oviparous  species  is 
further  resolved  by  the  phylogenetic  tree  analysis  presented  in  Figure  2.5.  The  results 
of  the  branch-and-bound  tree  search  suggest  that  the  present  structure  of  F.  heteroclitus 
Vtg  represents  a  substantial  history  of  divergence  from  the  ancestral  osteichthyean  Vtg. 
Although,  phylogenetically,  F.  heteroclitus  and  A.  transmontanus  are  considered 
monophyletic  as  actinopterygian  fishes  (Nelson,  1989),  the  Vtg  structure  of  A. 
transmontanus  was  found  to  be  more  closely  related  to  the  Vtgs  of  the  two  tetrapods  than 
it  was  to  that  of  F.  heteroclitus.  Indeed  many  character  traits  of  the  genus  Acipenser 
have long been recognized as tetrapod-like, i.e., a holoblastic embryonic cleavage and
anuran-like gastrulation (Balinsky, 1965; Beer, 1981; Conte et al., 1988), an acrosome-capped
spermatozoon (Conte et al., 1988), and development of oviducts from true
Müllerian ducts (Conte et al., 1988). We suggest that the structure of F. heteroclitus Vtg
represents  a  derived,  perhaps  more  specialized,  example  of  Vtg  structure  in  contrast  to 
the  tetrapod/chondrostean  Vtg,  which  more  likely  resembles  the  Vtg  of  an  ancestral 
osteichthyean.  If  this  is  the  case,  we  would  predict  that  the  structure  of  an  elasmobranch 
Vtg  (especially  from  a  less  derived  species)  would  also  resemble  the  tetrapod  Vtgs  more 
closely  than  it  would  a  teleostean  Vtg.  Whether  the  structure  of  lamprey  Vtg  represents 
an  independent  derivation,  or  an  even  earlier,  prototypical  vertebrate  Vtg,  is  difficult  to 
surmise.  This  question  will  be  more  easily  answered  once  a  protochordate  or 
invertebrate deuterostome Vtg (from within the Vtg family) has been sequenced. Within
the  invertebrate  outgroup  of  our  phylogram,  the  two  insect  Vtgs  appear  to  be  highly 
derived  versions  of  Vtg  structure  as  compared  to  the  C.  elegans  Vtg.  The  C.  elegans 
Vtg  is  substantially  more  similar  to  vertebrate  Vtgs  than  are  the  Vtgs  of  the  two  insects, 
suggesting  a  faithfulness  of  the  nematode  Vtg  to  an  ancestral  form  originating  in  a 
predecessor common to both vertebrates and nematodes.

In  reference  to  past  alignments  between  multiple  Xenopus  and  chicken  Vtgs, 
Byrne et al. (1989) described Pv as an independently evolving domain within Vtg. Our
alignment  confirms  this  suggestion.  While  the  two  Lv  domains  of  Vtg  can  be  well 
aligned  among  several  organisms,  the  polyserine  domain  exists  in  a  wide  range  of  sizes 
Figure 2.5 Branch-and-bound phylogenetic tree analysis comparing selected Vtgs
spanning 600 million years of divergence (Raff et al., 1989). PAUP
(Swofford, 1992) analysis was done on a ClustalV alignment (Higgins et
al., 1992) containing five of the vertebrate cDNAs from Fig. 2.4: chicken
Gallus domesticus Vtg II; clawed frog Xenopus laevis Vtg A2; white
sturgeon Acipenser transmontanus Vtg; mummichog Fundulus heteroclitus
Vtg; silver lamprey Ichthyomyzon unicuspis Vtg; plus three invertebrate
Vtg cDNAs: nematode Caenorhabditis elegans Vtg 5; boll weevil
Anthonomus grandis Vtg; and mosquito Aedes aegypti Vtg. The Gallus
Vtg was designated as the reference sequence and the C. elegans Vtg was
defined as the outgroup. The numbers of reconstructed changes in amino
acid sequence occurring along each branch are shown without parentheses;
bootstrap data are depicted at partition boundaries as percentages in
parentheses.
from being completely absent in C. elegans (not shown; Speith et al., 1985), to a small
size in F. heteroclitus (99 Ser within a 171 a.a. region), to a larger size in the chicken
Gallus (132 Ser within a 291 a.a. region). A trend emerges in consideration of these data:
as one proceeds up the vertebrate phylogenetic ladder, Vtg polyserine domains appear
to increase in size. However, at least two exceptions to this trend have been reported:
the lamprey Ichthyomyzon Vtg possesses a polyserine domain larger than that of F.
heteroclitus (113 Ser within a 238 a.a. region; Fig. 2.2), and the
Gallus Vtg III (Byrne et al., 1989) contains a small polyserine domain (37 Ser; not
shown).

Although  the  vertebrate  Vtg  polyserine  domains  vary  in  size  and  serine  content 
as  described  above,  their  genetic  organizations  have  sustained  an  element  of  similarity. 
At  the  DNA  level,  the  F.  heteroclitus  polyserine  domain  contains  a  distinct  cluster  of 
TCX  serine  codons  directly  preceding  a  larger  cluster  of  AGY  serine  codons  (Fig.  2.2), 
a  pattern  that  is  found  in  all  other  vertebrate  Vtg  cDNAs.  When  this  cluster  organization 
was  observed  by  Byrne  et  al.  (1989)  in  Xenopus  and  chicken  Vtgs,  it  was  speculated  that 
a  non-tetrapod  Vtg  would  perhaps  contain  a  cluster  of  only  one  type  of  serine  codon, 
representing  the  original  trinucleotide  repeating  unit,  and  thus  the  original  Vtg  polyserine 
domain.  However,  the  polyserine  domains  presented  here  from  the  lamprey,  sturgeon, 
and  mummichog  are  all  dominated  by  the  same  two  serine  codons  as  is  seen  in  Xenopus 
and  chicken,  suggesting  that  these  two  codon  clusters  have  been  present  within  the  Vtg 
gene  since  before  the  divergence  of  agnathans  and 
Figure 2.6. A comparison of the serine codon usage in the polyserine domains (see
Fig. 2.4) of five vertebrate Vtgs. Although the number of trinucleotide
repeats varies, the overall codon structure is conserved: a cluster of TCX
codons at the 5' end precedes a larger cluster of AGY codons. Only TCX
or AGY codons are shown. Relative lengths of polyserine domains are
drawn to scale.
gnathostomes, over 400 million years ago (Lovtrup, 1977). Chen et al. (1994) recently
described a mosquito cDNA sequence which codes for a Vtg containing three separate
polyserine domains; 82% of the serines in these domains are coded for by TCX
codons. Since insects are a highly derived group, it remains unclear whether the TCX
repeats represent the conservation of a primitive polyserine coding domain or an
incidence of convergent evolution between separate Vtg clades.

It has been theorized that the phosphoserine clusters of Pv, documented to bind
Ca2+ in a 1:1 stoichiometric ratio in Xenopus (Follet and Redshaw, 1968; Munday et al.,
1968; Wallace, 1970), are necessary for early bone mineralization in vertebrate embryos.
Even  more  speculative  is  the  idea  that  the  phosphoserine  tracts  of  Vtg  were  a  necessary 
pre-adaptation  allowing  the  original  evolutionary  emergence  of  ossified  bone  in  ancestral 
chordates.  Both  the  lamprey  and  the  sturgeon  are  examples  of  cartilaginous  vertebrates, 
albeit  with  bony  ancestors  (Jarvik,  1980),  that  have  retained  their  Vtg  polyserine 
domains.  Thus,  the  possession  of  a  Vtg  polyserine  domain  is  not  universally  concomitant 
with  the  possession  of  a  bony  skeleton.  Indeed,  it  appears  that  polyserine  domains  can 
no  longer  be  considered  an  exclusive  vertebrate  Vtg  characteristic.  Recent  reports  by 
Chen  et  al.  (1994)  describing  a  mosquito  (Aedes  aegypti)  Vtg  cDNA  and  Yano  et  al. 
(1994)  describing  a  silkworm  (Bombyx  mori)  Vtg  cDNA,  provide  invertebrate  sequences 
containing various arrangements of polyserine tracts. These findings suggest a
pre-chordate origin of Vtg polyserine domains and challenge the hypothesis of Pvs being
unique to chordates. However, polyserine domains are not synonymous with true Pv
domains. Whether these invertebrate polyserine tracts are highly phosphorylated and
cleaved, as are bona fide Pv proteins, has not yet been reported. It is possible that
polyserine  domains  have  existed  within  Vtgs  since  before  the  emergence  of  chordates, 
but  that  Pv  proteins,  per  se,  remain  a  unique  chordate  trait,  representing  a  novel 
modification  and  utilization  of  these  polyserine  tracts. 

Though  we  know  little  of  why  Vtg  polyserine  domains  vary  in  size,  findings  from 
studies  of  heritable  disease  may  offer  clues  as  to  how  these  size  differences  originated. 
The  aberrant  amplification  of  trinucleotide  repeats  from  one  generation  to  another  has 
recently  been  coupled  to  the  occurrence  of  several  human  genetic  diseases  including 
Huntington's  Disease  (Huntington's  Disease  Collaborative  Research  Group,  1993;  review 
by  Caskey  et  al.,  1992).  An  increased  potential  for  trinucleotide  amplification  may 
explain  the  faster  rate  of  evolution  attributed  to  the  Pv  region  in  comparison  to  its  two 
flanking  Lv  regions  (Byrne  et  al.,  1989).  We  are  aware  of  very  few  descriptions  of 
"yolk-based  diseases"  in  fish  (Olin  and  von  der  Decken,  1989),  and  in  these  it  was 
neither  suspected  nor  tested  whether  the  disease  was  caused  by  aberrant  amplification  of 
the  Pv  polyserine  domain.  Diseases  aside,  novel  duplications  or  omissions  in  the  Pv 
polyserine  domain  may  certainly  have  affected  the  evolution  of  specific  yolk  structures 
or functions. F. heteroclitus, possessing the smallest polyserine domain of our
alignment,  produces  a  yolk  which  remains  totally  soluble  throughout  oocyte  development. 
As  the  smaller  yet  serine-enriched  polyserine  domain  of  F.  heteroclitus  Vtg  is 
phosphorylated  and  finally  processed  into  a  more  soluble  Pv  yolk  protein,  it  may 
somehow  be  prevented  from  re-combining  with  the  Lv  yolk  proteins  and  forming  the 
insoluble  yolk  complexes  of  other  vertebrates.   Another  possible  explanation  for  the 
persistence of a liquid phase yolk in F. heteroclitus oocytes is that the high proteolytic
activity documented by Greeley et al. (1986) prevents the recombination of Pv and the
Lvs into their usual insoluble particles. As more examples of Vtg protein
structure are obtained from other liquid phase yolk producers, a more substantial and,
hopefully, causal difference between soluble and non-soluble yolk may emerge.

In  conclusion,  the  F.  heteroclitus  Vtg  cDNA  along  with  its  amino  acid  translation 
represents  the  first  complete  Vtg  sequence  documented  from  a  teleost  fish.  The  predicted 
primary  structure  suggests  to  us  that  a  heightened  proportion  of  phosphoserine  in  the 
polyserine domain endows the F. heteroclitus Pv yolk proteins with a higher solubility,
preventing the formation of the non-soluble yolk particles seen in many other
vertebrates.  Knowledge  of  the  complete  primary  structure  of  F.  heteroclitus  Vtg 
provides  us  with  useful  information  for  mapping  the  extensive  proteolytic  processing  of 
native  Vtg  into  its  respective  yolk  proteins.  We  hope  that  this  sequence  will  aid 
investigators  of  other  vertebrate  Vtgs  by  providing  a  piscine  model  for  molecular  probes 
and  antibodies.  Finally,  we  have  provided  yet  another  example  of  the  evolutionary 
independence  of  Pv  within  the  Vtg  gene,  where  the  codon  cluster  organization  is 
preserved,  yet  the  size  of  the  serine  clusters  and  intervening  regions  remains  quite 
unpredictable. 


CHAPTER  3 

SEQUENCE  COMPARISON  OF  FUNDULUS  HETEROCLITUS 
VITELLOGENINS  I  AND  II 

Introduction 

Vitellogenin  gene  families  have  been  described  from  various  metazoan  species 
including Xenopus laevis (Wahli et al., 1979; Wiley and Wallace, 1980; Tata et al., 1980),
Caenorhabditis elegans (Blumenthal et al., 1984), and chicken (Evans et al., 1987; Byrne
et  al.,  1989).  These  small  gene  families  from  individual  species  have  likewise  been 
shown  to  share  genomic  organization  and  sequence  identity,  establishing  these  related  Vtg 
genes  as  members  of  an  ancient  gene  superfamily  (Speith  et  al.,  1985;  Nardelli  et  al., 
1987;  Byrne  et  al.,  1989;  Speith  et  al.,  1991). 

The  existence  of  four  X.  laevis  Vtg  genes  can  be  partially  explained  by  the 
hypothesis  that  an  ancient  duplication  occurred  in  the  X.  laevis  genome  (Thiebaud  and 
Fischberg,  1977;  Bisbee  et  al.,  1977).  Tata  et  al.  (1980)  reported  the  extraordinary 
occurrence  of  twelve  to  sixteen  Vtg  genes  in  X.  laevis,  stating  that  only  four  to  six  of 
them  were  in  an  expressible  form,  and  that  the  rest  were  nonexpressible  or  "silent". 
Another  example  of  a  silent  Vtg  gene  has  been  documented  among  the  six  Vtg  genes  of 
C.  elegans;  whereas  vit-2  to  vit-6  have  been  shown  to  encode  specific  YP  proteins,  vit-1 
has  been  described  as  a  pseudogene  (Speith  et  al.,  1985)  with  no  apparent  translation 
product. In contrast, the three Vtg genes reported from the chicken include no apparent
silent genes. The primary translation products VtgI, VtgII, and VtgIII were found to be
present in blood in a ratio of 0.33 : 1.00 : 0.08, confirming chicken VtgII as the major
yolk  protein  precursor  (Wang  et  al.,  1983).  Recently  two  Vtgs  have  been  reported  in 
related  tilapia  species,  Oreochromis  aureus  (Ding  et  al.,  1989)  and  O.  mossambicus 
(Kishida  and  Specker,  1992).  These  studies  established  the  occurrence  of  two  piscine 
Vtgs (180 kDa and 130 kDa) using an immunological approach. The immunological data
from O. aureus were additionally complemented by a small nucleotide sequence from the
C-terminus of one of the purported Vtgs, probably the larger (Ding et al., 1990).
Though  the  existence  of  multiple  Vtgs  has  been  established  in  these  species,  it  remains 
unclear  as  to  why  several  Vtgs  would  be  functionally  necessary. 

We  have  recently  reported  the  cDNA  sequence  and  predicted  primary  structure 
of  Fundulus  heteroclitus  Vtg  I,  as  a  precursor  to  non-crystalline,  liquid  phase  yolk 
proteins (LaFleur et al., 1995). Here we describe a second F. heteroclitus Vtg cDNA and
protein sequence that we have designated as Vtg II. The predicted primary structure
shares 45% identity with Vtg I (with regions as high as 65%) and contains the same
general domain profile: a large lipovitellin 1 region, followed by a serine-rich phosvitin
region, and terminating in a lipovitellin 2 region. We have confirmed Vtg II mRNA
expression as well as a derived yolk protein cleavage product, verifying that Vtg II
represents a separate but functional Vtg. This report, therefore, establishes the existence
of a bona fide Vtg gene family in F. heteroclitus that acts as a precursor to liquid phase
yolk proteins.
Materials and Methods

Chemicals 

Estradiol-17β was obtained from Sigma Chemical Co. (St. Louis, MO).
Radioisotopes, [α-32P]dCTP and [α-35S]dATP, were purchased from New England
Nuclear (Boston, MA). Lambda gt10 vector, cDNA synthesis reagents, the subcloning
plasmid pGem-T, and T4 ligase were obtained from Promega (Madison, WI). All
amplification reactions were performed using a 50:1 mixture of Taq DNA
polymerase:cloned Pyrococcus furiosus DNA polymerase (Stratagene, La Jolla, CA).
All sequencing gels were cast using Sequagel-8 (National Diagnostics, Atlanta, GA)
polyacrylamide  reagents.  Sequenase  version  2.0  DNA  polymerase  and  dideoxy 
sequencing  reagents  were  obtained  from  US  Biochemicals  (Cleveland,  OH).  Reagents 
for  random-primed  labeling  of  probes  were  purchased  from  Pharmacia  (Piscataway,  NJ). 
Magna  nylon  transfer  membranes  were  used  for  nucleic  acid  transfers  and  purchased 
from  MSI  (Westboro,  MA).  Amino  acid  N-terminal  sequencing,  synthesis  of 
oligonucleotide  primers,  and  a  limited  amount  of  DNA  sequencing  were  performed  by 
the  University  of  Florida  Interdisciplinary  Center  for  Biotechnology  Research  core 
facilities. 

Cloning  strategy  using  an  estrogen-induced  liver  cDNA  library 

Seven of the eight overlapping clones resulting in the contiguous cDNA sequence
were isolated from a λgt10 liver library whose synthesis has been previously described
(LaFleur et al., 1995). In brief, the library was constructed from the pooled mRNA of
six male Fundulus heteroclitus that had been treated with two IP injections of estradiol-17β
(0.01 mg/g body weight). The library contained an initial titer of only 6 × 10^4 total
plaque-forming units, and had been amplified twice.

The initial Vtg II clone was discovered using the degenerate primer ROW 19 and
the vector primer NEB 1231, with 5 µl of the λgt10 library as template in a PCR
reaction utilizing a 50:1 mixture of Taq DNA polymerase:Pfu DNA polymerase. ROW
19 was designed to match a conserved region of Vtgs, ranging from C. elegans to
chicken (Fig. 3.1). A 550 bp band was isolated, inserted into pGem-T, sequenced, and
revealed to be a second Vtg cDNA that we designated as Vtg II. This insert was then
isolated and used to generate a random-primed 32P-labeled probe. The library was plated
out on 150-mm petri dishes by transferring E. coli C600hfl cells, and overlaying them
in agarose atop agar plates containing 25 µg/ml tetracycline. Duplicate plaque lifts were
carried out using Magna nylon membranes, and these were probed at 65°C in 0.05×
BLOTTO, 6× SSC (1× SSC: 150 mM NaCl, 15 mM sodium citrate, pH 7) overnight. A large
proportion of the plaques were found to be positive, and 20 agarose plugs were isolated
and stored in SM buffer (0.1 M NaCl, 8 mM MgSO4, 50 mM Tris, 2% gelatin) at 4°C
with  a  drop  of  chloroform.  Thereafter,  plug  lysates  from  these  Vtg  II  positive  plugs 
were  used  in  amplification  reactions  targeting  Vtg  II  positive  clones  in  a  successively 
overlapping  5'  direction.  Six  more  Vtg  II  clones  were  isolated  in  this  manner 
approaching  the  initial  methionine  codon,  but  several  attempts  at  targeting  the  last  few 
nucleotides  to  include  the  initial  methionine  failed. 
[Figure 3.1 diagram: Fundulus heteroclitus Vitellogenin II cDNA (5195 bp), showing the
relative positions of the overlapping inserts pFhv2a through pFhv2h and of the primers
ROW 19 and ROW 55.]

Figure 3.1  Cloning strategy used in isolating the F. heteroclitus Vtg II cDNA (5166
bp). Seven inserts (pFhv2a through g) were isolated from the λgt10 liver
library by anchored PCR with the indicated oligonucleotide primers and
inserted into the pGem-T cloning vector. The final cDNA (pFhv2h) was
isolated by RACE using reverse primer ROW 55.
A protocol for rapid amplification of cDNA ends (RACE; Frohman et al., 1988)
was performed to retrieve this region using total RNA (described below) isolated from
the liver of an individual reproductively active female F. heteroclitus. A first-strand
synthesis reaction was performed using 0.5 µg total RNA, the primer ROW 55, and
Superscript RT, followed by addition of a "poly-C tail" using 4 µl of 1.0 mM dCTP and
10 units of terminal deoxynucleotidyl transferase (BRL). Then, an amplification reaction
was carried out using the forward primer ROG 51, which targeted the poly-C tail, along
with the reverse primer ROW 55 and Taq DNA polymerase. Through this effort
we successfully isolated a 230-bp band that was inserted into pGem-T, sequenced, and
found to include a valid methionine codon, preceded by a short region that fit the criteria
for a translation start site (Kozak, 1991).

Estrogen treatment, RNA isolation and analysis

Male  and  female  F.  heteroclitus  were  collected  from  the  estuarine  creeks  adjacent 
to  the  Whitney  Laboratory,  and  were  maintained  in  running  seawater  tanks  under 
14L:10D  photoperiod  conditions  at  25  ±  2°C.  Fish  were  maintained  for  at  least  one 
month  before  being  used  for  RNA  collections. 

Experimental groups of fish were subjected to two intraperitoneal injections of
estradiol-17β (0.01 mg/g body weight) dissolved in 50 µl coconut oil (Kanungo et al.
1990). Control groups were sham-injected with coconut oil alone. The first injection
was performed on day 1, the second injection on day 4, followed by sacrifice and liver
dissection on day 8.
Total RNA was isolated from livers by extraction with RNA Stat-60 reagents
(Tel-Test "B", Inc., Friendswood, TX). Tissues were dissected and immediately frozen
in 1.5-ml tubes containing 500 µl of RNA Stat-60 emulsion by immersion in liquid
nitrogen. Tissues were homogenized at 20°C using a Kontes pestle and motor.
Typically, a 300 mg liver yielded 0.350 mg total RNA, with O.D. 260/280 ratios
consistently above 1.8. Total RNA samples were resuspended in diethyl pyrocarbonate-
treated water and stored at -80°C until used in analyses.

Before electrophoresis, aliquots of 15 µg total RNA were precipitated in
isopropanol,  and  denatured  in  2.2  M  formaldehyde,  50%  formamide,  50  mM  MOPS  (pH 
7.0)  for  30  min  at  65°C.  Samples  were  electrophoresed  through  gels  containing  1.0% 
agarose,  0.6  M  formaldehyde,  50  mM  MOPS,  and  1  mM  EDTA  for  2.0  hours  at  3.5 
V/cm  gel  in  50  mM  MOPS,  1  mM  EDTA  running  buffer.  RNA  was  blotted  onto  Magna 
nylon  membranes  by  capillary  action  with  20  X  SSC,  immobilized  by  U.V.  crosslinking 
and  visualized  by  staining  briefly  with  methylene  blue. 

Random-primed  [32P]probes  were  made  for  resolving  Vtg  I  and  Vtg  II  RNA 
transcripts. The Vtg I probe was synthesized from a PCR product amplified from the template
pMMBl  using  primers  ROW  5  and  MB  13,  resulting  in  a  639-bp  cDNA  probe  from 
nucleotide  4284  to  4923  of  the  Vtg  I  cDNA.  The  Vtg  II  probe  was  made  from  pFhv2a 
using  primers  ROW  19  and  ROW  33,  yielding  a  277-bp  probe  from  nucleotide  4692  to 
4969  of  the  Vtg  II  cDNA.  After  random  prime  labeling,  oligonucleotide  probes  were 
separated  from  non-incorporated  [32P]dCTP  by  size  chromatography  through  Stratagene 


Figure 3.2  Translated amino acid sequence (1687 residues) of the putative F.
heteroclitus Vtg II polypeptide. The signal peptide, predicted by
the method of von Heijne (1986), is indicated by underlining, and
verified by the N-terminal sequence obtained from an isolated 69-
kDa yolk protein (shaded lettering). The annealing site of ROW
19, used to isolate the initial insert, is indicated by double
underlining. A polyadenylation site is indicated by underlining.


aatrcaccagcc  12 


ATGAGGGTGCTTGTGCTGGCTCTCACTGTGGCCCTTGTGGCCGGGAACCAGGTGAGCTATGCCCCA  78 
H     R     V     L     V     L     &.    L.     T     V     A     L     V     A     <5,.  N     Q     V     S     *     A  P 

GAATTTGCCCCTGGAAAGACCTACGAGTACAAGTATGAAGGTTATATTC7GGGTGGCCTGCCTGAG  144 
EFAPGKTYEJKySGYILGGLPE 

GAGGGCCTGGCAAAGGCTGGGGTGAAGATCCAGAGCAAAGTCTTGATCGGTGCAGCAGGTCCTGAC  210 
EGLAKAGVKIQSKVLIGAAGPD 

AGCTACATTCTGAAACTTGAAGACCCTGTCATCTCGGGGTACAGTGGCATTTGGCCTAAAGAGGTT  276 
SYILKLEDPVISGrSGIWPKEV 

TTCCACCCTGCCACAAAGCTCACCTCAGCTCTCTCTGCTCAGCTCTTGACACCCGTCAAGTTTGAG  342 
FHPATKLTSAI.SAQLLTPVKFS 

TATGCCAACGGAGTGATCGGAAAAGTGTTCGCACCTCCAGGCATCTCTACAAATGTGCTGAATGTC  408 
YANGVIGKVFAPPGISTNVLNV 

TTCAGGGGAC7CCTCAACATG7TTCAGATGAACATCAAGAAGACTCAGAATGTGTATGACCTGCAA  474 
FRGLLNMFQMN  IKKTQNVYDLQ 

GAGACTGGAGTAAAAGGTGrGTGCAAGACACACTATATCCTTCATGAGGACTCCAAGGCTGATCGC  540 
ETGVKGVCKTHYILHEDSKADR 

CTCCACTTGACGAAAACCACAGACCTGAATCACTGCACCGACAGCATCCACATGGATGTTGGCATG  606 
LHLTKTTOLNHCTDS  IHMDVGM 

GCTGGTTATACGGAAAAATGTGCAGAG7GCATGGCTCGGGGAAAAACTCTTTCAGGAGCAATTTCT  672 
AG!fTEKCASCMARGKTLSGAIS 

GTCAACTACATCATGAAGCCGTCTGCCTCTGGCACCTTGATCCTAGAGGCAACCGCCACTGAGCTT  738 
VNYIMKPSASGTLILEATATEL 

CTCCAGTACTCGCCCG7CAACATTGTAAATGGAGCTGTCCAGATGGAGGCTAAGCAGACCGTGACC  304 
LQYSPVNIVNGAVQMEAKQTVT 

i 

TTCGTGGACATCAGGAAGACCCCAT7AGAGCCCCTCAAAGCAGACTATATTCCCCGTGGATCGCTC  370 
FVDIRXTPLEPLKADYIPRGSL 

AAGTACGAGTTAGGCACTGAATTCCTACAGACACCAATTCAGCTTCTGAGGATCACCAATGTCGAG  936 
KYELGTEFLQ7P  IQLLRITNVE 

GCTCAGATTGTTGAGTCTCTGAACAACCTAGTGAGCCTCAATATGGGCCATGCCCATGAGGATTCC  1002 
AQIVESLNNLVSLNMGHAHEDS 

CCTCTGAAGTTTATTGAGCTCATCCAGCTGCTGCGTGTGGCCAAGTATGAGAGCATTGAAGCTCTC  1068 
PLKFIEI.IQLLRVAKXESIEAL 

TGG  AGTCAGTTTAAAACCAAAATTGATCACAGGCACTGGTTGCTGAGCTCTATCCCTGCCATTGGT  1134 
WSQFKTKIDHRHWLLSS  IPAIG 

ACTCATGTTGCTCTCAAGTTCATCAAGGAGAAGATCGTTGCTGGTGAAGTCACTGCTGCTGAGGCT  1200 
THVALKFIKEKIVAGEVTAASA 

GCTCAGGCCATCATGTCATCT  ACACACTTGGTGAAGGCCGACCTGGAGGCAATCAAGCTTCAGGAG  1266 
AQAIMSSTHLVKADLEAIKLQS 

GGCCTGGCTGTGACCCCTAATATTCGGGAAAATGCAGGTTTGCGTGAACTCGTTATGCTGGGCTTT  1332 
GLAVTPNIRENAGLRELVMLGF 
GGCATCATGSrrCACaAATACTGTGTGGAGaACCCTTCATGTCCATCTGAGCrCOTCaGCCCAGTT  1398
GIMVHKYCVEMPSCPSELVRPV 

CATGACATTArrGCCAAGGCTCTTGAGAAACGCGACAATGATGAGCTCTCCCTCGCTCTCAAAGTT  1464 
HD      I  IAKALEKRDNDSLSLALKV 

CTGGGTAATGCCGGACATCCCAGCAGCCTGAAGCCAATCATGAAACTTCTTCCTGGCTTTGGCAGC  1530 
LGNAGHPSSLKPIMKLLPGFGS 

TCTGCCTCCGAACTTGAGCTCAGAGTTCACATTGACGCTACACTGGCGCTGAGGAAAATTGGCAAG  1596 
SASELELRVHIDATLALRKIGK 

AGAGAACCCAAGATGATTCAGGATGTGGCCCTTCAGCTCTTCATGGACAGGACTCTTGACCCAGAG  1662 
REPKMIQ.DVALQLFMDRTLDPE 

CTCCGTATGGT7GCTG7TGTTGTGCTGTTTGATACCAAGCTACCTATGGGTCTGATAACCACTCTC  1728 
LRMVAVVVLFDTKLPMGLITTL 

GCTCAGAGTCTCCTGAAACAGCCAAACCTGCAGGTCCTTAGCTTTGTCTACTCTTACATGAAGGCC  1794 
AQSLLKEPMLQVLSFVYSYMKA 

TTCACCAAGACCACCACCCCGGACCATTCCACTGTAGCCGCTGCCTGCAATGTTGCCATCAGGATC  1860 
FTKTTTPOHSTVAAACNVAIRI 

CTCAGCCCAAGATTCGAAAGACTGAGCTACCGCTACAGCCGAGCTTTCCATTATGACCACTATCAT  1926 
LSPRFERLSYRYSRAFHYDHIH 

AATCCTTGGATGCTGGGAGCTGCTGCCAGCGCATTTTACATCAATGATGCCGCGACTGTATTGCCA  1992 
NPWMLGAAASAFyiNDAATVLP 

AAAAACATCATGGCAAAAGCTCGCGTTTACCTCTCTGGAGTGTCTGTTGATGTTCTGGAGTTTGGA  2053 
KNIMAKARVYLSGVSVDVLEFG 

GCCAGAGCTGAAGGAGTGCAAGAGGCCCTTTTGAAAGCCCGTGATGTTCCTGAGAGTGCAGACAGG  2124 
ARAEGVQEALLKARDVPESADR 

CTCACCAAGATGAAGCAAGCTCTTAAGGCrCTGACTGAGTGGAGGGCCAATCCrrCCCGCCAGCCT  2190 
LTKMKQALKALTEMRANPSRQP 

CTCGGCTCrcrGTACGTGAAGGTTCTTGGGCAGGATGTTGCATTTGCAAACATCGACAAAGAAATG  2256 
LGSLyVKVLGQDVAFANIDKSM 

GTTGAGAAGATCATTGAGTTTGCAACrGGACCTGAAATCCGCACCCGTGGCAAAAAGGCCTTGGAC  2322 
VEKIIEFATGPEIRTRGKKALO 

GCCCTGTTGTCTGGTTACTCTATGAAATACTCCAAGCCAATGTCGGCCATTGAGGTCCGTCACATC  2388 
ALLSGYSMKXSKPMSAIEVRHI 

TTCCCCACCTCTCTTGGTTTACCCATGGAGCTCAGTCTGTACACTGCTGCCGTGACAGCCGCATCC  2454 
FPTSLGLPMELSLYTAAVTAAS 

GTTGAAGTACAAGCCACCATTTCACCACCACTTCCCGAGGACTTCCATCCTGCCCACCTACTGAAG  2520 
VEVQATISPPLPSDFHPAHLLK 

TCTGATATTTCCATGAAGGCTTCAGTCACTCCAAGTGTATCTTTGCACACCTATGGAGTTATGGG A  2586 
SDISMKASVTPSVSLHTJGVMG 

GTGAATAGTCCTTTCATCCAGGCTTCTGTGCTGTCAAGAGCCAAAGACCATGCAGCTCTTCCCAAA  2652 
VNSPFIQ.ASVLSRAKDHAALPK 

AAGATGGAGGCAAGAC7TGACATAGTCAAGGGTTACTTTAGCTACCAGTTCC7GCCTGTTGAGGGT  2718 
KMEARLO  IVKGifFSYG.FI.PVEG 


Figure  3.2~continued 
GT'AAAACAATTGCATCTGCTCGTCTTGAAACAGTTGCCATTGCAAGAGATGTTGAAGGCCTCGCT  2784
VKTIASARLETVAIARDVEGLA 

GCTGCCAAAGTCACACCGGTTGTCCCATATGAGCCTATTGTGAGCAAGAACGCCACTTTAAATCTr  2850 
AAKVTPVVPyEPIVSKMATLML 

TCACAGATGTCTTACTATCTGAATGATAGCATATCAGCATCATCTGAACTTCTTCCTTTTTCGCTG  2916 
SQMSYYLNDSISASSELLPFSL 

CAAAGGCAAACTGGCAAAAATAAAATCCCCAAGCCCATTGTGAAGAAAATGTGTGCAACAACGTAT  2982 
QRQTGKNKIPKPIVKKMCATTY 

ACGTATGGGATTGAGGGCrGCGTTGACATTTGGTCTCGCAATGCAACCTTCCTCAGAAACACCCCC  3048 
TYGIEGCVDIWSRNATFLRNTP 

ATCTACGCCATAATTGGAAACCACTCTCTTTTGGTTAATGTTACCCCAGCTGCTGGACCGTCCATC  3114 
IYAIIGNHSLLVNVTPAAGPSI 

GAAAGGATCGAAATCGAGGTTCAGTTTGGTGAACAAGCAGCAGAAAAGATCCTTAAAGAGGTTTAC  3180 
ERIEIEVQFGEQAAEKILKEVY 

CTGAATGAGGAGGAAGAAGTACTTGAAG ACAAAAACGTCCTTATGAAGCrGAAGAAGATrCTGTCT  3246 
LNEEEEVLEDKSVLMKLKKILS 

CCTGGTCTGAAGAACAGCACCAAAGCTTCATCCTCTAGTTCGGGCAGCTCTCGCTCCAGTAGATCT  3  3 12 
PGLKMSTKASSSSSGSSRSSRS 

CGCTCCAGCAGCTCCAGCAGCTCCAGCAGCTCCAGCAGCTCCAGCCG7TCCTCCTCTAGCTCTTCC  3378 
RSSSSSSSSSSSSSSRSSSSSS 

AGGAGCTCTTCCTCTTTGCGCCGCAATAGCAAGATGTTGGATCTrGCCGATCCCCTCAACATAACA  3  444 
RSSSSLRRNSKMLDLADPLNIT 

TCAAAGAGATCCTCCAGCAGCTCCTCCAGCTCCAGCTCCTCCAGCTCCTCCAGCTCCTCCAGCTCC  3510 
SKRSSSSSSSSSSSSSSSSSSS 

TCCAGCTCCAAGACCAAGTGGCAGCTGCACGAAAGGAACTTCACCAAGGATCACATCCACCAGCAT  3576 
SSSKTKWQLHERNFTKDHIHQH 

TCCGTCTCAAAAGAACGTCTTAACAGCAAGAGCAGTGCGAGCAGCTTTGAATCCATTTACAACAAG  3642 
SVSKERLNSKSSASSFESIYNK 

ATCACATACCTGTCTAACATCGTCAGCCCAGTGGTCACAGTCCTTGTCCGTGCCATCAGAGCTGAC  3708 
ITKLSNIVSPVVTVLVRAIRAD 

CACAAGAACCAGGGGTATCAGATCGCTGTGTACTATGACAAACTCACTACCAGAGTGCAGATCATT  3774 
HKMQGyQIAVrifDKLTTRVQII 

GTGGCCAACCTCAC7GAAGATGACAACTGGAGAATCTGTTCTGACAGCATGATGCTCAGCCACCAC  3840 
VANLTEDDNWRICSDSMMLSH  H 

AAAGTGATGACTCGAGTCACCTGGGGCATTGGATGCAAGCAGTACAACACCACGATCGTGGCCGAA  3906 
KVMTRVTMGIGCKQYNTTIVAE 

ACTGGTCGCGTTGAGAAGGAGCCTGCCGTCCGTGTGAAGCTGGCCTGGGCCAGACTCCCTACTTAC  3972 
TGRVEKEPAVRVKLAWARLPTY 

ATCAGGGATTATGCAAGAAGAGTGTCCAGGTACATTTCCCGCGTCGCTGAGGACAATGGAGTGAAC  4038 
IRDYARRVSRlflSRVAEDMGVM 

AGGACAAAGGTCGCCAGTAAACCCAAAGAGATCAAACTGACTGTAGCTGTTGCCAACGAGACAAGC  4104 
RTKVASKPKEIKLTVAVANETS 


Figure  3.2-continued 


C7GAA7G7CACGC7GAA7ACACCAAAGAACACC77777CAAAC7GGGA7GGG77C77CCC7777AC  4170 
LNV7LNTPKN7FFKLGWVLPFY 

C7ACCAA77AACAACAC7GC7GC7GAGC7GCAGGCA77CCAGGGCAGG7GGA7GGACCAGG7CACA  42  3  6 
LP  INNTAAELQAFQGRWMDQVT 

7ACA7GC7CACCAAG7C7GC7GCAGC7GAG7GCACCG7GG77GAAGACACAG7GG7CAC777CAAC  4302 
XML7KSAAAEC7VVED7VVTFN 

AACAGGAAGTACAAAACGGAGACGCCCCACTCTTGCCATCAGGTCTTGGCTCAAGATTGCACATCr  4363 
NRKYKTETPHSCHQVLAQDCTS 

GAAA7CAAATTCA7AG7GC7GC7GAAGAGGGA7CAAACAGCAGAACGGAATGAGA7CAG7ATTAAG  443  4 
EIKFIVLLKRDQTAERIIEISIK 

A77GAAAACA7TGA7G7TGACA7G7ATCCCAAGGACAACGC7G77G7GG7GAAGG77AA7GGAG7A  4500 
IENIOVDMyPKDMAVVVKVMGV 

GAAA77CC7C7CACCAACC7GCCATATCAGCA7CCAACAGGCAACA7ACAGA7CC3ACAAAGAGAA  4565 
EIPLTNLPYQHPTGNIQIRQRS 

GAGGGCA7C7C7C7GCA7GC7CCCAG7CA7GGCC77CAGGAGG7C77CC7CAG7T7AAACAAAG7G  4632 
EGISLHAPSHGLQEVFLSLNKV 

CAGG77AAAG77G77GAC7GGA7GAGAGGCCAGACG7G7GGGC7C7GCGGAAAGGCCGACGGGGAA  4698 
QVK'.  VOWMRGQTCGLCGXADGE 

G7CAGACAGGAG7ACAGCAC7CCCAA7GAACGGG7G7CCAGGAACGCAACCAGCT7CGC7CATTCC  4764 
VRQEYSTPNERVSRNA7SFAHS 

TGGG7GC7GCC7GCAAAGAGC7GCCG7GACGCC7CAGAG7GC7ACA7GCAACTTGAA7CGG7GAAG  4830 
WVLPAKSCRDASECyMQLESVK 

C7CGAGAAACAGATCAGCC7GGAAGGCGAGGAA7CCAAA7GC7AC7CAG7CGAACCTG7C7GGCGC  4896 
LSKQISLEGEESXCrSVEPVWR 

7G7C7CCC7GGC7G7GCACCAG7GAGAACCACC7CCG7CAC7G7CGGGC7ACCA7GCG7G7C7C7G  4962 
CLPGCAPVR77SV7VGLPCVSL 

GA77CAAACC7GAA7CGC7C7GA7AG7C7CAGCAGCA7C7A7CAGAAGAGCG77GACG7GAGCGAG  5028 
OSNLKRSDSLSSirQKSVOVSE 

ACGGCAGAG7CCCACC7GGCC7G7CGCTGCAC7CC7CAG7G7GCC7AAacgrgtt:gcctcc=gacc  5094 
TAESHLACRC7PQCA- 

etcgttcrgttrttgcgttatatggatgctcgcaaactaaaataaagaagcaactaaaaaaaaaaa  5160 
aaaaaat==agcrtggacctaaccaggcrgaacct  5195 
Nuc-Trap columns. All RNA hybridizations were carried out at 65°C in 1 × Denhardt's
solution, 6 × SSC, and 0.1% SDS without formamide (Denhardt 1966).
Autoradiographs were analyzed using the Bio Image Whole Band Analyzer system
(Millipore, Ann Arbor). To estimate the amounts of Vtg RNA visualized on gels, RNA
was transcribed from the Vtg I plasmid pMMBl and the Vtg II plasmid pFhv2a using
Ambion reagents. Transcribed RNA yields were measured spectrophotometrically, and
the RNA was diluted to a concentration of 66.7 pg/µl. For RNA standards, 133 pg of
transcribed RNA from each of pMMBl and pFhv2a was loaded onto each gel.
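
The standards described above lend themselves to a simple single-point calibration. The following sketch first checks the loading arithmetic (133 pg at 66.7 pg/µl is about 2 µl) and then scales a sample band's integrated intensity to that of the standard. The function names and all band intensities are hypothetical, and a linear detector response is assumed; the calculation actually performed by the Whole Band Analyzer is not described in the text.

```python
# Single-point calibration sketch. Intensities are hypothetical values,
# and a linear relationship between band intensity and RNA mass is assumed.

STOCK_PG_PER_UL = 66.7   # concentration of the diluted transcribed RNA
STANDARD_PG = 133.0      # mass of standard RNA loaded per gel


def load_volume_ul(mass_pg, conc_pg_per_ul):
    """Volume of stock needed to deliver a given mass of RNA."""
    return mass_pg / conc_pg_per_ul


def estimate_mass_pg(sample_intensity, standard_intensity,
                     standard_pg=STANDARD_PG):
    """Scale a sample band to the standard band (linear response assumed)."""
    if standard_intensity <= 0:
        raise ValueError("standard intensity must be positive")
    return standard_pg * sample_intensity / standard_intensity


volume = load_volume_ul(STANDARD_PG, STOCK_PG_PER_UL)

# Hypothetical integrated band intensities (arbitrary units).
vtg1_pg = estimate_mass_pg(sample_intensity=5000.0, standard_intensity=1000.0)
vtg2_pg = estimate_mass_pg(sample_intensity=500.0, standard_intensity=1000.0)

print(f"standard volume: {volume:.2f} ul")
print(f"Vtg I: {vtg1_pg:.1f} pg, Vtg II: {vtg2_pg:.1f} pg")
```

Each probe is compared only with its own standard, so differences in probe length or specific activity between the Vtg I and Vtg II probes do not enter the ratio.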

Results 

The complete cDNA sequence (5166 bp) of a Vtg mRNA, encoding a protein
designated as Vtg II, is provided in Figure 3.2. The eight overlapping pGem-T clones
that were used to complete the sequence are represented in Figure 3.1. A ClustalV
alignment of Vtgs I and II by the method of Swofford et al. (1993) revealed 45%
sequence identity between the two amino acid sequences (Fig. 3.3).
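
A percent-identity figure of this kind can be computed directly from a pairwise alignment. The sketch below uses two short, hypothetical aligned strings (not the Vtg sequences) and counts identical columns over all aligned columns, including those with a gap in one sequence. That choice of denominator is an assumption, since the text does not state how the 45% value was normalized.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Columns that are gaps in both rows are ignored; columns with a gap in
    only one row count toward the denominator but never as identities.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 5 identical columns out of 9.
toy_a = "MKAV-LALT"
toy_b = "MRVLVLALT"
pid = percent_identity(toy_a, toy_b)
print(f"{pid:.1f}% identity")
```

Dividing by the length of the shorter ungapped sequence instead would give a slightly higher value for the same alignment.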

In general, the two sequences share the same profile as other reported Vtgs: a
large lipovitellin 1 region that is followed by a polyserine domain (assumed to represent
phosvitin) that, in turn, is followed by a lipovitellin 2 region containing a
substantial number of conserved cysteines. Like Vtg I, Vtg II contains several predicted
N-glycosylation (16), phosphorylation (45), and N-myristoylation sites (16), agreeing
with our expectations for a lipophosphoglycoprotein. The shorter length of the Vtg II
amino acid sequence (1687) compared to that of Vtg I (1704) can be primarily attributed to gaps
in the polyserine domain. A graphical comparison of the polyserine domains of Vtg


Figure  3.3  ClustalV  alignment  of  F.  heteroclitus  Vtg  I  and  Vtg  II.  A 
polyserine  domain  defined  according  to  a  previously  published 
alignment  (LaFleur  et  al.  1995)  is  indicated  by  shaded  lettering. 
Identical  residues  are  denoted  by  asterisks.  Vtg  I  and  Vtg  II  share 
45%  overall  sequence  identity. 
[Figure 3.3 alignment panel: Vtg I (residues 1-1704) aligned against Vtg II (residues
1-1687) in 50-residue blocks, with identical residues marked by asterisks and the
polyserine domain shaded.]

Figure 3.3--continued
I and II (Fig. 3.4) reveals a departure from a trend that had previously been noted
concerning serine codon usage in Vtg I (LaFleur et al., 1995) and other vertebrate Vtgs
(Byrne et al., 1989). Whereas the polyserine domains of most vertebrate Vtgs contain a
cluster of TCX codons at the 5' side of the polyserine coding domain and a cluster of
AGY codons at the 3' side, the Vtg II polyserine domain appears to have these codons
evenly dispersed, with no obvious clustering.
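
The interspersed pattern can be seen in a small piece of the Vtg II polyserine coding region. The sketch below classifies the serine codons in nucleotides 3445 to 3510 of Figure 3.2 (an in-frame stretch chosen here only because it is free of ambiguous characters) as TCX or AGY. The function name is hypothetical, and the fragment covers only part of the domain, so the counts do not represent the whole domain.

```python
# Nucleotides 3445-3510 of the Vtg II cDNA (Figure 3.2), in reading frame.
FRAGMENT = (
    "TCAAAGAGATCCTCCAGCAGCTCCTCCAGCTCCAGC"
    "TCCTCCAGCTCCTCCAGCTCCTCCAGCTCC"
)


def serine_codon_classes(cds):
    """Return 'TCX' or 'AGY' for each serine codon, in 5' to 3' order."""
    classes = []
    usable = len(cds) - len(cds) % 3   # ignore any trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        if codon.startswith("TC"):
            classes.append("TCX")      # TCT, TCC, TCA, TCG
        elif codon in ("AGT", "AGC"):
            classes.append("AGY")
    return classes


classes = serine_codon_classes(FRAGMENT)
tcx = classes.count("TCX")
agy = classes.count("AGY")
# One letter per serine codon: T for TCX, A for AGY.
pattern = "".join(c[0] for c in classes)

print(f"TCX: {tcx}, AGY: {agy}")
print(pattern)
```

In the printed pattern the two codon classes alternate every one to three codons rather than forming a 5' TCX block followed by a 3' AGY block.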

Northern blot analyses showed that the Vtg II transcript can be found
in both estrogen-treated males and spawning females, at an approximate size of 6.0 kb
(Fig. 3.5). Analysis of duplicate blots with separate Vtg I and Vtg II cDNA probes
showed that Vtg II transcripts were about ten-fold less abundant than those of Vtg I. Vtg I
probes did not cross-hybridize with RNA transcribed from Vtg II clones and vice versa,
confirming that the two probes detect separate Vtg I and Vtg II mRNAs (Fig. 3.5).

The N-terminal amino acid sequence of a 69 kDa protein band isolated from the
yolk protein of ovulated eggs was determined to be NQVSYAPEFAPGxTY,
where "x" was undetermined ("YP 69" indicated in Chapter 4). Allowing the
predicted K residue in the unidentified "x" position, this sequence provides a perfect
match for the N-terminus of Vtg II after cleavage of the predicted signal peptide (Fig.
3.2, shaded lettering) and indicates that the N-terminus of Vtg II is not blocked, unlike
that of Vtg I (LaFleur et al. 1995). These data verify that the Vtg II protein is in
fact expressed, transported, and incorporated as a yolk protein precursor in oocytes of
F. heteroclitus.
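
The match can be reproduced from the first 90 coding nucleotides of Figure 3.2. The sketch below translates them with the standard genetic code and searches for the sequenced peptide, treating the undetermined "x" as a wildcard. The variable and function names are illustrative only; the nucleotide and peptide strings are taken from the figure and the text.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Nucleotides 13-102 of the Vtg II cDNA (Figure 3.2), starting at the ATG.
CDS_START = (
    "ATGAGGGTGCTTGTGCTGGCTCTCACTGTGGCCCTTGTGGCCGGGAACCAGGTGAGCTATGCCCCA"
    "GAATTTGCCCCTGGAAAGACCTAC"
)

# N-terminal sequence of the 69 kDa yolk protein; "x" was undetermined.
YP69_NTERM = "NQVSYAPEFAPGxTY"


def translate(cds):
    """Translate an in-frame coding sequence, one letter per codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


protein = translate(CDS_START)
# Turn the undetermined position into a capturing wildcard.
match = re.search(YP69_NTERM.replace("x", "(.)"), protein)

signal_peptide = protein[:match.start()]
x_residue = match.group(1)

print("predicted N-terminus:", protein)
print("signal peptide:", signal_peptide, f"({len(signal_peptide)} residues)")
print("residue at x:", x_residue)
```

The peptide begins immediately after the first 15 residues of the translation, and the wildcard position falls on the predicted lysine.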
[Figure 3.4 diagram: polyserine domains of Vtg I and Vtg II drawn to scale (scale bar,
20 codons), with each AGY and each TCX serine codon marked by a separate symbol and the
number of Ser codons of each type listed at the right.]


Figure  3.4  A  comparison  of  the  serine  codon  usage  in  the  polyserine  domains  (see 
Fig.  3.3)  of  F.  heteroclitus  Vtg  I  and  Vtg  II.  Whereas  the  TCX  and  AGY 
codons  of  the  Vtg  I  polyserine  domain  are  clustered  into  two  separate 
groups,  the  TCX  and  AGY  codons  of  Vtg  II  show  no  apparent  clustering. 
Only  serine  codons  are  shown,  with  relative  lengths  of  the  domains  drawn 
to  scale. 
Discussion

The F. heteroclitus Vtg II cDNA and predicted amino acid sequence are provided in
Figure 3.2. Vtg II mRNA is present in the liver of estrogen-treated males and normal
spawning females (Fig. 3.5). The N-terminal sequence of a yolk protein isolated from
ovulated eggs was found to be identical to the predicted N-terminus of the putative Vtg
II translation product. Taken together, these data indicate that the yolk proteins of F.
heteroclitus are derived from a mixture of at least two estrogen-induced liver precursors,
Vtg I and Vtg II, establishing the existence of a Vtg gene family in F. heteroclitus.
These two cDNAs represent the first two Vtg sequences from a single vertebrate species
to have been completely sequenced, offering a unique perspective on the possible
variance between Vtg isoforms occurring within a single species.

Examination  of  the  alignment  of  Vtg  I  and  Vtg  II  reveals  typical  conservation  of 
lipovitellin regions seen among other vertebrate Vtgs. As previously described for other
vertebrate Vtgs (LaFleur et al., 1995), poor alignment occurred in the polyserine
domains. Although the tandem repeats of serine can be aligned in small stretches, the
overall  lengths  and  intervening  amino  acid  sequences  are  highly  variable,  resulting  in  a 
region  whose  conservation  is  difficult  to  interpret.  In  an  attempt  to  compare  and 
visualize  these  polyserine  domains,  hypothetical  boundaries  were  drawn  up  according  to 
those  used  in  a  previous  report  (LaFleur  et  al.,  1995),  and  a  graphical  representation  was 
created  showing  relative  domain  length  as  well  as  serine  codon  usage  (Fig.  3.4). 
Whereas  the  serine  codons  (TCX  and  AGY)  of  the  Vtg  I  polyserine  domain  appear  to  be 


Figure 3.5  Northern blot analysis comparing relative expression of F.
heteroclitus Vtg I and Vtg II mRNAs.

A) Methylene blue staining of duplicate samples transferred to
nylon membranes before hybridization, showing equal loading of
lanes, as indicated by 28S and 18S rRNA bands. Lanes a and a'
contain 300 pg Vtg I RNA transcribed from plasmid cDNA
(pMMBl); lanes b and b' contain 300 pg Vtg II RNA transcribed
from plasmid cDNA (pFhv2a); lanes c and c' contain 15 µg total
liver RNA from an estrogen-treated male; lanes d and d' contain
15 µg total liver RNA from a female four days before spawning;
lanes e and e' contain 15 µg total liver RNA from a female four days
after spawning. RNA markers (kb) are indicated with arrows on
the left.

B) Autoradiography of the membranes shown above, indicating
bands hybridizing to Vtg I (left side) and Vtg II (right side) DNA
probes. Note that the Vtg I probe did not hybridize to the Vtg II
control RNA (lane b) and the Vtg II probe did not hybridize to the Vtg
I control RNA (lane a').
separated into two general clusters, those of the Vtg II polyserine domain are
randomly interspersed. This arrangement of polyserine codons again confirms
the observation of Byrne et al. (1989) that the polyserine, or phosvitin,
domain is an independently evolving domain within the Vtg gene, showing more
variability than its flanking lipovitellin regions.

The predicted post-translational modifications of Vtg II are in agreement
with expectations for a lipophosphoglycoprotein. Although 45 predicted phosphorylation
sites may appear high, we expect the actual extent of phosphorylation to be even
higher. Seven of the 45 predicted phosphorylation sites occur within
the polyserine domain (all protein kinase C sites); however, previously
published accounts suggest that every serine residue in this domain is likely
phosphorylated, resulting in a very hydrophilic domain with a highly negative
charge. Phosvitin yolk proteins have been described as the most highly
phosphorylated of any known proteins. Unfortunately, the hepatic
Vtg kinase responsible for phosphorylating the extensive polyserine domains of
Vtg has not yet been isolated or characterized, so an algorithm predicting its
target sites is not yet available.
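
Site predictions of this kind are typically consensus-motif scans. The sketch below applies two widely used consensus patterns, the N-glycosylation sequon N-X-[S/T] (X not proline) and the protein kinase C motif [S/T]-X-[R/K], to two short stretches translated from Figure 3.2. Which prediction program produced the counts reported here is not stated, so these patterns are an assumption, and the dictionary and function names are hypothetical.

```python
import re

# Assumed consensus motifs; lookaheads allow overlapping sites to be found.
MOTIFS = {
    "N-glycosylation": r"N(?=[^P][ST])",
    "PKC phosphorylation": r"[ST](?=.[RK])",
}

# Translations of nucleotides 2983-3114 and 3247-3312 of Figure 3.2.
LV_FRAGMENT = "TYGIEGCVDIWSRNATFLRNTPIYAIIGNHSLLVNVTPAAGPSI"
POLYSER_FRAGMENT = "PGLKNSTKASSSSSGSSRSSRS"


def find_sites(protein, pattern):
    """1-based positions (within the fragment) of the modified residue."""
    return [m.start() + 1 for m in re.finditer(pattern, protein)]


glyco_sites = find_sites(LV_FRAGMENT, MOTIFS["N-glycosylation"])
pkc_sites = find_sites(POLYSER_FRAGMENT, MOTIFS["PKC phosphorylation"])

print("N-glycosylation sequons:", glyco_sites)
print("PKC motifs:", pkc_sites)
```

In the polyserine fragment only serines that lie two residues upstream of an arginine or lysine are reported, which illustrates why a motif scan counts far fewer sites than the number of serines present.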

Considering the ratio of expression of Vtg I and Vtg II, our data suggest
that Vtg I is the major yolk protein precursor. Vtg II mRNA is present in the
liver of spawning females at a ratio of 1:10 with respect to Vtg I RNA, as
evidenced by northern blots. In SDS PAGE analysis, YP 69, which was mapped
to Vtg II, is hardly discernible (not shown here) when compared to the Vtg I-derived
yolk proteins, YP 125 and YP 105 (Chapter 4), agreeing well with the
mRNA expression data. This may suggest a difference in the interaction of
the estrogen-estrogen receptor complex with the estrogen response elements
(ERE) suspected to lie upstream of the Vtg I and Vtg II coding regions. Isolation
and characterization of the ERE from each Vtg gene should offer valuable
insights into ERE mechanics. Another explanation for the difference in amounts
of Vtg transcript may involve RNA stabilization, rather than gene transcription.
It has been shown that the half-life of Vtg transcripts increases dramatically in the
presence of estrogen (Brock and Shapiro, 1983). A recent report suggests that
an estrogen-inducible protein that binds to the 3' untranslated region of Xenopus
Vtg may be responsible for this stabilization (Dodson et al., 1995). It would be
interesting to compare the interactions of this protein with two closely
related Vtg mRNAs, F. heteroclitus Vtgs I and II. Furthermore, our estrogen-
induced liver library would offer an excellent template to screen for cDNAs
that might code for this protein.

By  completing  the  sequences  of  two  Vtgs,  we  now  have  the  basic 
information  and  tools  to  molecularly  dissect  the  process  of  yolk  formation  in  F. 
heteroclitus.  We  may  begin  to  answer  questions  about  the  functional  significance 
of  possessing  multiple  Vtgs.  Antibodies  produced  against  non-conserved  regions 
of  the  two  Vtgs  may  indicate  differences  in  receptor  mediated  endocytosis, 
compartmentalization,  or  catabolism  by  the  embryo.  Cycling  controls  involving 
Vtg  may  also  be  studied;  for  instance,  Vtg  cDNA  probes  can  be  used  to  document 
the fine-tuned expression of Vtg that must occur in a sequentially spawning
animal.  Besides  being  used  as  tools  to  specifically  investigate  F.  heteroclitus 
reproduction,  the  Vtg  I  and  II  cDNAs  represent  valuable  bio-markers  for  assaying 
the  reproductive  health  of  naturally  occurring  fish.  As  examples  of  mRNAs  and 
proteins  that  are  normally  induced  by  estrogens,  Vtg  I  and  II  will  be  particularly 
valuable  in  testing  for  the  estrogenic  effects  of  environmental  contaminants  such 
as  polychlorinated  biphenyls  (Bergeron  et  al.  1994;  Guillette  et  al.  1994). 


CHAPTER  4 
PRECURSOR-PRODUCT  RELATIONSHIP  OF 
VITELLOGENINS  I  AND  II  TO  THE  YOLK  PROTEINS 
OF FUNDULUS HETEROCLITUS

Introduction 

Current  views  concerning  the  origin  and  processing  of  yolk  proteins  in  oviparous 
vertebrates were formed through a slow and controversial suite of biochemical studies
that  eventually  elucidated  two  unexpected  aspects  concerning  the  origin  of  yolk  proteins 
(reviews  by  Wallace  1978,  1985;  Eckelbarger  1994).  First,  it  was  shown  that  yolk 
proteins originated "hetero-synthetically" in the liver, rather than the ovary. Second,
it was shown that yolk proteins were not synthesized individually, but rather as a large
protein precursor that was subsequently processed into bona fide yolk proteins. This
yolk  protein  precursor,  vitellogenin  (Vtg),  has  now  been  documented  to  appear  in  the 
blood  of  estrogen-treated  males  or  spawning  females  from  countless  oviparous  vertebrates 
(Wallace  and  Jared,  1969).  Additionally,  Vtg  has  been  documented  to  be  incorporated 
into  growing  oocytes  by  receptor-mediated  endocytosis  (Wallace  and  Jared,  1969b; 
Opresko  et  al.,  1980;  Stifani  et  al.,  1990;  Shen  et  al.,  1993;  Shibata  et  al.,  1993),  and 
processed  into  yolk  proteins  (Wallace  and  Jared,  1969).  Although  isotopic  and 
immunologic  tracking  studies  have  established  the  connection  between  yolk  proteins  and 
Vtg,  direct  sequence  data,  mapping  the  precursor-product  relationship  of  Vtg  to  the 
derived yolk proteins, have been scarce (Clark, 1973; Bergink and Wallace, 1974; Byrne
et al., 1984; Gerber-Huber et al., 1987; Wallace et al., 1990b; Yamamura et al., 1995)
and especially lacking from teleosts (Matsubara and Sawano, 1995). Recent studies
focusing on Vtg genes and cDNAs have documented that many animals possess multiple
Vtg genes and proteins (Wahli et al., 1979; Blumenthal et al., 1984; review by Byrne
et al., 1989), offering an even more challenging puzzle to workers seeking to map these
relationships.

Obtaining a clear synopsis of precursor-product relationships in many teleosts is
further  complicated  by  the  extensive  yolk  protein  processing  that  occurs  in  teleost  yolk 
as  compared  to  the  yolk  of  tetrapods.  The  most  striking  difference  in  yolk  content 
documented  in  F.  heteroclitus  concerns  the  disappearance  of  a  125-kDa  yolk  protein  (YP 
125),  and  the  concomitant  appearance  of  smaller  yolk  protein  bands  immediately  prior 
to  oocyte  ovulation  (Wallace  and  Begovac,  1985;  Wallace  and  Selman,  1985;  Greeley 
et  al.,  1986).  This  enhanced  proteolytic  processing  may  be  connected  to  a  unique  pre- 
ovulatory process  that  occurs  in  some  teleost  oocytes,  termed  hydration.  Near  the  time 
of  germinal  vesicle  breakdown,  a  rapid  increase  in  oocyte  volume  occurs,  usually 
attributed  to  the  uptake  of  water  (Fulton  1898;  reviewed  in  Selman  and  Wallace,  1989). 
In  F.  heteroclitus,  a  substrate  spawner,  post-maturational  oocytes  possess  twice  the 
volume of pre-maturational oocytes (Wallace and Selman, 1985; Greeley et al., 1991;
McPherson  et  al.),  but  in  the  oocytes  of  pelagic  spawners,  oocyte  volumes  can  increase 
over  four  times  the  original  volume,  in  as  little  as  twelve  hours  (Wallace  and  Selman, 
1981;  Watanabe  and  Kuo,  1986;  Craik  and  Harvey,  1987;  LaFleur  and  Thomas,  1991). 
Several possible factors have been hypothesized to drive hydration, ranging from the
osmotic balance of ions (Hirose, 1976; Watanabe and Kuo, 1986; LaFleur and Thomas,
1991; Greeley et al., 1991; Wallace et al., 1992), to ionic balance via gap junction control
(Cerdà et al., 1993), to the colligative osmotic contribution of cleavage peptides and
free amino acids (Oshiro and Hibiya, 1981; Wallace and Selman, 1985; Greeley et al.,
1987; Thorsen et al., 1993). With these issues in mind we sought to characterize the
precursor-product  relationship  between  Vtg  and  yolk  proteins  in  F.  heteroclitus,  with 
emphasis  on  the  processing  of  YP  125.  By  completing  the  cDNA  and  putative  protein 
sequences of two F. heteroclitus Vtgs (LaFleur et al., 1996; Chapter 3), we obtained the
necessary  blueprint  for  comparison  of  microsequencing  data.  In  this  paper  we  document 
internal  and  N-terminal  amino  acid  sequences  from  seven  isolated  yolk  proteins,  all  of 
which  can  be  positioned  within  the  Vtg  I  and  Vtg  II  predicted  protein  sequences.  Our 
data suggest that the majority of yolk proteins are derived from Vtg I, and that a small
proportion is derived from Vtg II. Additionally, we suggest that the rapid processing of
YP  125  during  hydration  is  associated  with  the  presence  of  a  PEST  site  (Rogers  et  al. 
1986)  near  its  predicted  C-terminus. 

Materials  and  Methods 

Ovarian  follicles  were  dissected  from  the  ovaries  of  reproductively  active  F. 
heteroclitus.  Up  to  20  prematurational  follicles  or  up  to  10  ovulated  eggs  were  aliquoted 
into a 1.5-ml Eppendorf tube containing 500-750 µl of sample buffer (0.1 M Tris, pH
6.8, 2% SDS, 64 mM dithiothreitol, 10% glycerol) on ice. The follicles were
immediately ground with a Kontes pestle and heated for 10 min at 100 °C. The
homogenate was then briefly centrifuged at 12,000 g for 1 min, separating the dissolved
yolk  from  insoluble  cellular  debris.  The  supernatant  was  aliquoted  to  a  fresh  tube  and 
stored  at  -20°C  until  electrophoresis.  Samples  were  diluted  again  by  as  much  as  1:50 
with  sample  buffer  before  loading  onto  gels. 

Sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) was
carried out according to Laemmli (1970), using 125 × 140 × 1.5 mm slab gels
containing a 3.5% stacking gel overlaying a separating gel ranging from 7% for larger
YPs to 12% for smaller YPs, with modifications based on the protocol of Schägger and
von Jagow (1987) using Tris-tricine running buffers.

Proteins in electrophoresed gels were electroblotted onto PVDF membranes in
buffer containing 10 mM MES, pH 6, and 20% methanol at 20 V overnight. Protein
bands were visualized by brief staining in 0.02% Coomassie blue in 40% methanol plus
5% acetic acid, destained in 40% methanol plus 5% acetic acid, and rinsed in distilled
water. Membranes were dried and stored at -20 °C until individual bands were cut out
and submitted for sequencing. N-terminal amino acid analyses were performed on
PVDF-bound proteins using an Applied Biosystems Model 473A Sequencer (LeGendre and
Matsudaira, 1988) by the Protein Chemistry Core Facility of the University of Florida.

The two largest N-terminally blocked yolk proteins (YP 125 and YP 105) were
again electrophoresed, blotted onto PVDF, and subjected to in situ cleavage (Scott et al.,
1988) by endoproteinase LysC (Endo LysC; 0.003 units/µg protein, Promega) in 50 mM
Tris, pH 8.8, 0.2 M ammonium bicarbonate, 0.1% SDS, and 0.1 mM EDTA. Protein
fragments were then separated on Tris-tricine gels, blotted to PVDF, and visualized by
silver staining (Wray et al., 1981). Similar bands of 13 kDa (presumed to be identical)
were isolated from both the YP 125 and YP 105 digestions and once again submitted to
the Protein Chemistry Core for N-terminal amino acid sequencing.

Figure 4.1 Major yolk proteins isolated from oocytes and eggs of Fundulus
heteroclitus. The major yolk proteins shown here were resolved on an
SDS-PAGE gradient gel (7%-20%), enabling the resolution of a wide
range of proteins, from 125 kDa to 20 kDa. For N-terminal
sequencing, however, straight gels were used at various acrylamide
concentrations, allowing optimal resolution of yolk proteins at specific size
classes. Yolk proteins that were isolated for N-terminal sequencing are
indicated with our designated labels. Note that YP 125 appears as a robust
band when isolated from pre-maturational oocytes, but is hardly visible
in yolk isolated from ovulated eggs. (Photo courtesy of R. McPherson,
Clarion University)

[Figure 4.2 image: molecular weight standards (Mw, kDa) at left; the 13 kDa
band in each digestion lane is marked and labeled with the peptide sequences
LFESLVDSDKW... and YEFSDELLQTPL... (both lanes) and KYxAKHIGVGLK... (one lane).]

Figure 4.2 Endo LysC digestion products of YP 125 and YP 105. After partial
digestion with Endo LysC, polypeptide fragments were electroblotted onto
a PVDF membrane and silver stained. Positions of the 13 kDa bands
(presumed to be identical) from each digestion are indicated. Molecular
weight standards (kDa) are shown on the left.

Sequencing  data,  including  Vtg  I  and  II  cDNAs,  along  with  microsequencing 
results  were  organized  using  the  PC/GENE  software  package  (Intelligenetics,  Mountain 
View,  CA).  Prediction  of  signal  peptides  was  carried  out  according  to  von  Heijne 
(1986).  PEST  sites  were  designated  according  to  the  algorithm  described  by  Rogers  et 
al. (1986). Other Vtg sequences referred to in this paper include chicken Vtg II
(gi:63887; van het Schip et al., 1987), Xenopus laevis Vtg A2 (gi:139636; Gerber-Huber
et al., 1987), lamprey (Ichthyomyzon unicuspis) Vtg (gi:213312; Sharrock et al., 1992),
and sturgeon (Acipenser transmontanus) Vtg (gi:437051; Bidwell and Carlson, 1995).

Results 

The yolk proteins typically found in F. heteroclitus oocytes and eggs are
shown in Figure 4.1, along with our designations of certain bands according
to their apparent molecular mass. At least nine yolk proteins were resolved by
Tris-tricine SDS-PAGE, and these were blotted onto PVDF membranes and submitted for
protein sequencing by Edman degradation. Four yolk proteins appeared to be
N-terminally blocked, while five yielded N-terminal sequences (Table 4.1).

Table 4.1. N-Terminal Sequences of F. heteroclitus Yolk Proteins

Protein   Source       N-Terminal Sequence         Vtg source   n
YP 125    Oocyte       Blocked                     ?            3
YP 105    Oocyte/egg   Blocked                     ?            2
YP 83     Egg          Blocked                     ?            1
YP 80     Oocyte/egg   Blocked                     ?            1
YP 77     Egg          Blocked                     ?            1
YP 69     Egg          NQVSY APEPA PGXTY SYXYE     Vtg II       1
YP 45     Oocyte       HKKMV AxGxx A               Vtg I        2
YP 39     =39          EEEAV VAVIL RAVKA D         Vtg I        2
YP 29     Oocyte       AAAAE xSFVE DTLYT FN        Vtg I        1
YP 20     Oocyte       EEDVE PIPEY KFRRF AKKYC     Vtg I        2
ELC 13    YP 125       YEFSD ELLQT PLQLI KISD      Vtg I
ELC 13    YP 125       LPESL VDSDK WENP LLREV      Vtg I
ELC 13    YP 125       KYCAK HIGVG LKACF KFASQ *   Vtg I
ELC 13    YP 105       YEFSD ELLQT PLQLI KISD      Vtg I
ELC 13    YP 105       LPESL VDSDK WENP LLREV      Vtg I

* Mapped to Vtg I (982-1001), C-terminal to a PEST site
ELC denotes N-terminus of products cleaved with Endo LysC (0.003 units/µg protein)

By aligning the yolk protein N-terminal sequences against the predicted amino
acid sequences of Vtg I and Vtg II, we successfully mapped the five sequenced yolk
proteins to positions within their respective precursors (Fig. 4.3). Of
these five sequences, the most notable was that of YP 69, which aligned with the
N-terminus of Vtg II, verifying the expression of this secondary Vtg as well as
demonstrating that the signal peptide cleavage site had been correctly predicted. The
data from YP 69 also indicate that the N-terminus of Vtg II is unblocked, in contrast to
the apparently blocked N-terminus of Vtg I.
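
The mapping step described above can be illustrated with a minimal sketch. It assumes that an Edman read is written with optional spaces between five-residue groups and that "x" or "X" marks a cycle where no residue could be assigned. The function name `map_peptide` and the short precursor string are hypothetical and are not taken from the Vtg cDNA translations.

```python
import re


def map_peptide(peptide: str, precursor: str) -> list[int]:
    """Return 1-based start positions where `peptide` matches `precursor`.

    Spaces in the peptide are ignored, and 'x'/'X' (an unassigned Edman
    cycle) is allowed to match any residue.
    """
    cleaned = peptide.replace(" ", "")
    pattern = "".join("." if c in "xX" else re.escape(c) for c in cleaned)
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", precursor)]


# Hypothetical precursor fragment, for illustration only.
toy_precursor = "MRVLAHKKMVAEGLSAQQT"
positions = map_peptide("HKKMV AxGxx A", toy_precursor)
print(positions)
```

A single hit supports an unambiguous placement; several hits, or none, would call for a longer read or a closer look at the unassigned cycles.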

Figure 4.3 A graphical representation of F. heteroclitus yolk proteins positioned
along the length of Vtg I and Vtg II. Lengths and positions along the
Vtg molecules are drawn to scale according to alignments of N-terminal
data with cDNA translations. C-termini of yolk proteins were calculated
according to molecular weight estimations and should be regarded as
putative. The signal peptides and polyserine domains as predicted from
cDNA translations are indicated.

Figure 4.4 A graphic representation of the 13 kDa digestion products and their
positions in reference to YP 125, YP 105 and Vtg I. Note that the third
digestion product of YP 125 lies beyond the calculated C-terminus of YP
105. The indicated PEST site was found in YP 125, but is truncated, and
thus invalidated, in YP 105.

In order to identify the origin of YP 105 and YP 125, the protein bands were
again blotted onto PVDF membranes and proteolytically cleaved with Endo LysC
(0.003 units/µg protein, Promega) in 50 mM Tris, pH 8.8, 0.2 M ammonium
bicarbonate, 0.1% SDS, and 0.1 mM EDTA. The digestion products were again
separated on Tris-tricine gels and visualized by silver staining. The reaction with Endo
LysC was confirmed as only a partial digestion by the isolation of peptide products larger
than those predicted if cleavage had occurred at every lysine residue. The pattern of
electrophoresed digestion products from YP 125 and YP 105 initially appeared to be
identical, indicating that the two yolk proteins originated from the same precursor
molecule (Fig. 4.2). However, a difference between the digestion products was
discovered when the 13-kDa peptides derived from YP 125 and YP 105 were sequenced.
The 13-kDa band isolated from the YP 105 digestion contained two peptides, mapping near
the N-terminal region of Vtg I. The 13-kDa band isolated from YP 125 contained the
same two peptides found in the YP 105 digestion, plus a third peptide
(KYCAKHIGVGLKACFKFASQ) that mapped much further along the Vtg I sequence,
to residue 982 (Figs. 4.4 and 4.5). We interpret these data as evidence that YP 105 and
YP 125 are identical Vtg I-derived yolk proteins except for a short 20-kDa extension at
the C-terminus of YP 125 that contains the third Endo LysC digestion product (Fig. 4.3).
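
The reasoning used above to recognize a partial digestion can be sketched in a few lines. This is an illustration only: it assumes the textbook rule that Endo LysC cuts on the C-terminal side of every lysine, and the function names and the toy sequence are hypothetical rather than drawn from the Vtg I translation.

```python
def lysc_fragments(protein: str) -> list[str]:
    """Limit (complete) digest: cut after every lysine (K)."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        if residue == "K":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # C-terminal remainder that does not end in lysine
        fragments.append(protein[start:])
    return fragments


def is_partial_digest(observed_lengths: list[int], protein: str) -> bool:
    """True if any observed product is longer than the longest limit-digest fragment."""
    longest = max(len(f) for f in lysc_fragments(protein))
    return any(n > longest for n in observed_lengths)


# Hypothetical sequence, for illustration only.
toy = "MAKLLEKGGSTPKAVVDE"
print(lysc_fragments(toy))
print(is_partial_digest([10], toy))
```

In practice the comparison would be made on estimated masses rather than residue counts; residue counts are used here only to keep the sketch short.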

The  C-terminus  of  YP  105  was  predicted  to  lie  at  (or  before)  residue  Ser  962  of 
Vtg  I,  using  the  estimated  mass  of  YP  105  and  the  masses  of  the  individual  residues 
predicted  from  the  Vtg  I  cDNA.  This  places  the  YP  105  C-terminus  only  2  residues 
away  from  the  N-terminal  residue  obtained  from  YP  20  (Glu  965)  suggesting  that  YP  105 
and  YP  20  result  from  cleavage  of  YP  125.  The  estimated  juncture  between  YP  105  and 
YP  20  lies  at  the  exact  midpoint  of  a  predicted  PEST  site  (residues  952-974,  receiving 
a  score  of  6.9,  where  5.0  and  above  is  considered  a  site).  This  purported  cleavage  site 
bisects  the  predicted  PEST  site,  leaving  the  two  resulting  protein  sequences  with  termini 
that do not surpass the cutoff value for valid PEST sites. Thus, although YP 125
contains a PEST site, neither of its cleavage products, YP 105 and YP 20, does.
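
The C-terminus estimate described above amounts to summing residue masses from a known N-terminal position until the apparent mass of the band is reached. The sketch below shows that arithmetic under stated assumptions: standard average residue masses, one water added for the free termini, and no allowance for phosphorylation, glycosylation, or the imprecision of SDS-PAGE mass estimates, which is why the result is best read as "at or before" a position. The function name `predict_c_terminus` is hypothetical.

```python
# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # added once for the free N- and C-termini


def predict_c_terminus(sequence: str, n_term_pos: int, target_mass_da: float) -> int:
    """Return the 1-based position of the last residue that can be included,
    starting at n_term_pos, without the summed mass exceeding target_mass_da."""
    mass = WATER
    last = n_term_pos - 1
    for pos in range(n_term_pos, len(sequence) + 1):
        mass += RESIDUE_MASS[sequence[pos - 1]]
        if mass > target_mass_da:
            break
        last = pos
    return last


# Toy example: five glycines and a 200 Da target.
print(predict_c_terminus("GGGGG", 1, 200.0))
```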

Discussion 

We  have  presented  precursor-product  relationships  to  account  for  the  origin  of 
seven  yolk  proteins  isolated  from  oocytes  and  eggs  of  F.  heteroclitus.  Likewise,  the 
sequences  determined  from  these  yolk  proteins  verify  the  expression,  transport,  and 
incorporation  of  both  the  yolk  protein  precursors  Vtg  I  and  Vtg  II,  whose  cDNA 
sequences  are  provided  in  Chapters  2  and  3,  respectively. 

We had initially assumed that YP 125 and YP 105, the major bands in oocyte
extracts, were derived separately from Vtg I and II, but the internal sequences indicated
that both yolk proteins originate from Vtg I. We can thus surmise that Vtg I is truly the
major yolk protein precursor in F. heteroclitus. Our finding that most of the yolk protein
is derived from Vtg I agrees with northern blot analyses that suggested ten times more
Vtg I than Vtg II message is present in total liver RNA (Chapter 3; LaFleur et al.,
1996).

Figure 4.5 A summary of the precursor-product relationship of Vtg I to
derived yolk proteins. The entire translated amino acid sequence
of the Vtg I cDNA (LaFleur et al., 1995) is presented,
separated into sections representing yolk proteins as indicated by
brackets on the right. N-terminal sequences of isolated yolk
proteins are indicated by double underlining. Internal sequences
obtained from Endo LysC digestion products are indicated by
shaded lettering. The residues of the PEST site are represented by
boldface lettering. The predicted polyserine domain (no
N-terminal sequencing data) is shown in brackets.

A  major  factor  that  prevents  construction  of  a  definitive  map  accounting  for  all 
yolk  proteins  derived  from  the  Vtgs  is  the  difficulty  in  isolating  and  microsequencing 
peptides  derived  from  the  phosvitin  domain  (Wallace  and  Begovac,  1986;  Wallace  et  al. 
1990). Although we expect that the polyserine repeats represented in both Vtg I and II
cDNAs are processed into true phosvitins, we have been unable to verify this by
N-terminal sequencing. The highly negative charge of phosvitin prevents it both from
staining with Coomassie blue and from adhering to PVDF membranes for sequencing. Because
phosvitin  can  be  visualized  using  Stains-all,  it  has  been  documented  as  a  single  25-30 
kDa  band  in  prematurational  oocytes,  with  at  least  four  smaller  phosvitin-like  bands 
(phosvettes)  appearing  in  preparations  from  ovulated  eggs  (Wallace  and  Begovac,  1985). 
We  estimate  that  the  C-terminus  of  YP  20  (and  presumably,  the  C-terminus  of  YP  125) 
lies  adjacent  to  the  N-terminus  of  phosvitin,  as  predicted  by  the  position  of  the  Vtg  I 
cDNA  polyserine  repeating  region.  Likewise,  the  sequence  obtained  from  YP  45, 
sharing  identity  with  residues  1220-1230  of  Vtg  I,  most  likely  abuts  the  C-terminal 
cleavage  site  of  phosvitin. 

As  previously  mentioned,  one  of  the  most  pronounced  changes  observed  to  occur 
in  F.  heteroclitus  yolk  proteins  is  the  disappearance  of  YP  125  during  the  transformation 
of oocytes to mature, ovulated eggs (Fig. 4.1). A possible explanation for this rapid and
rather selective proteolysis is the occurrence of a PEST site within the C-terminal tail of
YP 125. The apparently longer-lived YP 105 is identical to YP 125 except that it lacks
the C-terminal tail where the PEST site occurs. PEST sites were initially defined as a
conserved clustering of amino acids observed in proteins known to be
rapidly degraded. Common to all PEST sites are high local concentrations of Pro, Glu,
Ser, and Thr, and to a lesser extent Asp. Of the other five vertebrate Vtg sequences
contained in GenBank, chicken Vtg II (residues 1058-1080 and 931-951) and lamprey Vtg
(residues  1161-1182  and  1360-1393)  contain  two  PEST  sites,  while  Xenopus  Vtg  A2 
contains  a  sequence  (residues  953-969)  with  a  score  (4.71)  very  close  to  the  cutoff  value 
of  5.  The  lack  of  proteolysis  during  oocyte  maturation  in  such  animals  may  indicate 
either  the  absence  of  an  appropriate  proteolytic  mechanism  or  the  inaccessibility  of  the 
cleavage  sites  in  the  granular  yolk  of  these  animals  (Wallace,  1985).  The  Vtg  of 
sturgeon,  a  chondrostean  fish,  does  not  contain  a  PEST  site. 
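
As a rough illustration of how candidate PEST regions can be located, the sketch below implements only a simplified candidate-selection step as it is commonly described for PEST-finding programs: a stretch of at least 12 residues that is free of Lys, Arg, and His and contains at least one Pro, one Asp or Glu, and one Ser or Thr. It does not compute the score to which the cutoff of 5 applies, it treats the ends of the sequence as acceptable flanks (an assumption), and it should not be read as the exact procedure used in this study. The function name and the toy sequence are hypothetical.

```python
import re


def pest_candidates(seq: str, min_len: int = 12) -> list[tuple[int, int, str]]:
    """Return (start, end, segment) for candidate stretches, 1-based inclusive.

    A candidate is a run of residues containing no K, R, or H that is at
    least `min_len` long and has at least one P, one D or E, and one S or T.
    """
    hits = []
    for m in re.finditer(r"[^KRH]+", seq):
        segment = m.group()
        if (len(segment) >= min_len
                and "P" in segment
                and any(c in segment for c in "DE")
                and any(c in segment for c in "ST")):
            hits.append((m.start() + 1, m.end(), segment))
    return hits


# Hypothetical sequence, for illustration only.
print(pest_candidates("MKSPEESTDPLLSERAAK"))
```

Under this simplified rule, a fragment that ends in the middle of such a stretch can fall below the minimum length and no longer qualify. That is only loosely analogous to the score-based argument made for YP 105 and YP 20.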

The  proteolytic  processing  of  YP  125  has  been  implicated  as  part  of  the  hydration 
mechanism  of  F.  heteroclitus  oocytes,  with  the  generated  small  peptides  and  free  amino 
acids  providing  the  osmotic  potential  to  drive  an  uptake  of  water  into  the  oocyte  (Wallace 
and  Begovac,  1985;  Wallace  and  Selman,  1985).  More  recent  data  suggest  that 
hydration  in  F.  heteroclitus  is  primarily  due  to  K+  fluxes  via  the  gap  junctions  between 
oocytes and follicle cells (Wallace et al., 1992; Cerdà et al., 1993), but the possibility
of some contribution to hydration from yolk cleavage has not yet been
abandoned. So far, complete Vtg sequences have been reported for no teleost
other than F. heteroclitus. However, as more sequences are completed, it will be
interesting to see whether PEST sites are found in other teleostean Vtgs, especially those
of pelagic spawners in which both oocyte hydration and yolk proteolysis are especially
pronounced.


CHAPTER  5 

FUNDULUS  HETEROCLITUS  CHORIOGENINS:  LIVER-DERIVED 
COMPONENTS  OF  THE  VITELLINE  ENVELOPE  AND  CHORION 
SHARING  SEQUENCE  IDENTITY  WITH  MAMMALIAN  ZP  PROTEINS 

Introduction 

The  spawned  eggs  of  the  estuarine  teleost  Fundulus  heteroclitus  are  exposed  to 
quite  a  different  environment  than  the  ovulated  eggs  of  mammals.  Whereas  mammalian 
eggs  are  protected  from  infection,  desiccation,  and  predation  by  the  safe  surroundings 
of  the  uterus,  F.  heteroclitus  eggs  are  released  and  fertilized  during  the  tumultuous  spring 
tides,  and  deposited  into  empty  mussel  shells  or  onto  the  leaves  of  marsh  grass,  where 
they  remain  actually  stranded  above  the  water  line  for  fourteen  days  until  the  embryos 
emerge  by  hatching  during  the  next  spring  tide  (Taylor  et  al.,  1977;  Hsiao  et  al.,  1994). 
Though  exposed  to  extremely  different  environments,  both  of  these  vertebrate  eggs  are 
protected  by  a  quasi-similar  layer  of  extracellular  matrix  (ECM).  In  mammals  this 
translucent layer of ECM is termed the zona pellucida (ZP), but in fish and many
invertebrates it is often referred to as the vitelline envelope or chorion.

In  this  paper  we  adhere  to  the  definitions  of  Dumont  and  Brummett  (1980) 
regarding  the  vitelline  envelope  and  chorion.  They  stated  that  the  term  "vitelline 
envelope"  referred  to  the  highly  structured  acellular  layer  that  appears  and  encloses  the 
teleost oocyte during its development, while the term "chorion" referred to the
structurally  and  perhaps  chemically  transformed  vitelline  envelope  that  surrounds  the 
ovulated  egg,  separates  from  the  egg  at  the  time  of  fertilization,  and  encloses  the  embryo 
until  hatching.  Implicit  in  these  definitions  is  the  assumption  that  the  proteinaceous 
structure  of  the  vitelline  envelope  comprises  a  substantial  component  of  the  chorion. 

The  structure  of  the  teleostean  vitelline  envelope  has  been  well  documented  in 
several cyprinodont species (Yamamoto, 1963; Flügel, 1967; Dumont and Brummett,
1980)  as  well  as  in  many  other  teleosts  (reviewed  by  Dumont  and  Brummett,  1985; 
Selman  and  Wallace,  1989).  Early  biochemical  characterizations  of  the  vitelline  envelope 
and  chorion  concentrated  on  the  formation  of  the  vitelline  envelope  during  oocyte 
development  (Chaudry,  1956;  Yamamoto,  1963;  Flegler,  1977;  Tesoriero,  1977),  as  well 
as  the  breakdown  of  the  chorion  by  the  proteolytic  enzymes  of  the  hatching  embryo 
(Yamamoto and Yamagami, 1975; Kaighn, 1964; Hagenmaier, 1985). In earlier works
it had been assumed, but not proven, that the major vitelline envelope proteins (VEPs)
were synthesized by the ovarian follicle, with the site of synthesis residing in either the
oocyte or the surrounding follicle cells (Anderson, 1967). More recent investigations
targeting VEP synthesis include studies by Tesoriero (1978) using [3H]proline
incorporation, and by Begovac and Wallace (1989) in which incorporation of
[35S]methionine combined with immunohistochemistry provided evidence that at least one
of  the  VEPs  from  the  pipefish,  Syngnathus  scovelli,  originated  from  within  the  ovarian 
follicle. 

A  new  direction  towards  understanding  vitelline  envelope  formation  was  launched 
by research concentrating on the chemistry of hatching enzymes in the medaka, Oryzias
latipes. When polyclonal antibodies directed against protein fragments of the lysed
chorion were used as probes on medaka tissues, Hamazaki et al. (1984) found that tissues
other than the ovary were recognized by the antibodies. By 1989, Hamazaki et al. (1989a)
had  isolated  an  estrogen-induced  glycoprotein  from  the  liver  that  could  be  localized  to 
the  inner  layer  of  the  vitelline  envelope.  Since  then  additional  reports  have  verified  these 
findings  in  several  other  fish  (Hyllner  et  al.,  1991;  Murata  et  al.,  1991;  Oppen-Berntsen 
et  al.,  1992a,  1992b;  Larsson  et  al.,  1994).  Additionally,  Hyllner  et  al.  (1991)  showed 
that  the  synthesis  of  VEPs  could  be  induced  by  estrogen  treatment  in  males  of  the 
rainbow  trout  (Oncorhynchus  mykiss),  brown  trout  (Salmo  trutta),  and  turbot 
(Scophthalmus  maximus),  providing  convincing  evidence  that  the  major  VEPs  in  these 
species  could  be  synthesized  without  any  contribution  by  the  ovary. 

So  far,  only  two  nucleotide  and  protein  sequences  representing  piscine  VEPs  have 
been  published.  Lyons  et  al.  (1993)  reported  a  gene  sequence  (wf)  from  the  flounder, 
Pseudopleuronectes  americanus,  that  they  described  as  a  "teleostean  homolog  of  a 
mammalian  ZP  gene."  The  predicted  amino  acid  sequence  contained  a  novel  PQQ 
repeating  region  near  the  N-terminus,  resembling  a  motif  found  in  extracellular  matrix 
proteins.  Murata  et  al.  (1995)  reported  a  cDNA  sequence  (L-SF)  from  medaka  that  also 
shared  identity  with  mammalian  ZP  proteins.  The  predicted  amino  acid  sequence  of  the 
medaka  L-SF  protein  shared  more  identity  with  mouse  ZP3  (37.9%;  Ringuette  et  al. 
1988)  than  it  did  with  the  flounder  ZP  sequence  (18%)  previously  mentioned,  suggesting 
the  presence  of  at  least  two  distinct  groups  of  teleost  ZP  homologs. 
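
Percent identity figures such as those quoted above depend on how the sequences are aligned and on which positions are counted. The source does not state the convention used, so the following is only a sketch of one common choice: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment: five comparable columns, four of them identical.
print(percent_identity("ACDE-F", "ACDQGF"))
```

Other denominators (full alignment length, or the length of the shorter sequence) give lower values for the same alignment, so identities reported by different programs are not always directly comparable.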
In this paper we present the predicted primary structure of three proteins that
share  identity  with  mammalian  ZP  proteins.  Furthermore,  we  have  isolated  cDNAs 
encoding  these  sequences  from  a  liver  library  rather  than  an  ovarian  library,  followed  by 
northern  analyses  revealing  liver  rather  than  ovarian  transcripts.  Lastly,  the  amino  acid 
compositions  predicted  from  our  cDNAs  are  similar  to  the  composition  of  VEPs  isolated 
from  F.  heteroclitus  follicles.  Therefore,  we  conclude  that  the  proteins  encoded  by  these 
cDNAs  are  synthesized  by  the  liver,  transported  to  the  ovary,  and  incorporated  into  the 
vitelline  envelope.  We  further  suggest  that  as  major  constituents  of  the  vitelline 
envelope,  these  proteins  eventually  contribute  to  the  structure  of  the  hardened  chorion, 
where  they  remain  until  finally  degraded  by  embryonic  hatching  enzymes.  We  designate 
these  cDNAs  and  the  proteins  that  they  encode  as  "choriogenins"  (Chgs)  to  emphasize 
their  role  as  proteins  of  the  vitelline  envelope  and  chorion,  yet  to  underscore  their  site 
of  synthesis  as  being  extra-ovarian,  and  thus  different  from  that  of  the  mammalian  ZP 
proteins.  Although  the  teleostean  chorion  and  mammalian  zona  pellucida  have  different 
appearances,  functions,  and,  as  this  study  verifies,  origins  of  synthesis,  we  provide 
evidence  that  the  constituent  molecules  appear  to  have  evolved  from  a  set  of  common 
ancestral proteins.

Materials  and  Methods 

Reagents 

Estradiol-17β was obtained from Sigma Chemical Co. (St. Louis, MO).
Radioisotopes, [α-32P]dCTP and [α-35S]dATP, were purchased from New England
Nuclear (Boston, MA). Lambda gt10 vector and cDNA synthesis reagents were obtained
from Promega (Madison, WI). The subcloning plasmid pGEM-T was a product of
Promega. All "ROW" oligonucleotide primers were synthesized by the University of
Florida Interdisciplinary Center for Biotechnology Research (ICBR) oligonucleotide core
facility, while primers labelled "GL" were synthesized by Bio-Synthesis (Friendswood,
TX).  Sequenase  version  2.0  DNA  polymerase  and  dideoxy  sequencing  reagents  were 
obtained  from  US  Biochemicals  (Cleveland,  OH).  In-house  sequencing  gels  were  cast 
using Sequagel-8 (National Diagnostics, Atlanta) polyacrylamide reagents. Some cDNA
sequencing, especially through repeating regions or when verification was needed, was
performed by the University of Florida ICBR DNA Sequencing Core. Amplification
reactions were performed using a 1:50 mixture of cloned Pfu DNA polymerase and
Thermus aquaticus DNA polymerase (Stratagene and Promega, respectively).
Reagents  for  random-primed  labeling  of  probes  were  purchased  from  Pharmacia 
(Piscataway,  NJ).  Magna  nylon  and  PVDF  transfer  membranes  were  obtained  from  MSI 
(Westboro,  MA)  and  Millipore  Corp.  (Bedford,  MA),  respectively. 

Cloning  Strategy 

A liver cDNA library was constructed from poly A+-RNA pooled from five F.
heteroclitus males that had been treated with two injections of estradiol-17β, as
previously described (LaFleur et al., 1995). While screening the λgt10 library for Vtg
cDNAs using anchored PCR, we isolated several non-target cDNAs. Three of these
non-Vtg cDNAs were revealed by BLAST analysis to code for protein sequences that
resembled mammalian ZP proteins. The clones containing these initial cDNAs were used
as probes and to design primers that would target additional cDNAs in order to complete
the Chg coding regions (Fig. 5.1).

[Figure 5.1 diagram. Labels: Chg 500 (clones pChg1a, pChg1b; primers ROW 45,
ROW 52); Chg 427 (clones pChg2a, pChg2b; primers ROW 55, ROW 65); Chg 553
(clones pChg3a, pChg3b; primers ROW 45, GL1); scale bar, 200 bp.]

Figure 5.1 Strategy for cloning Chg 500, 427 and 553 cDNAs. Boxes indicate
relative sizes of contiguous cDNA sequences coding for Chgs 500, 427,
and 553. Thin black lines represent individual cDNA isolates obtained by
anchored PCR or RACE and inserted into pGem-T. Arrows indicate
gene-specific primers that were used in initial amplifications of individual
clones. The scale bar represents 200 bp.

The first Chg cDNA, pChg1a, was isolated by anchored PCR with a Vtg II-targeted
reverse primer, ROW 45, and the λgt10 vector primer, NEB 1231. To retrieve
a cDNA containing the poly-A tail, primer ROW 52 was designed from the
3' side of pChg1a. An overlapping clone (pChg1b), containing a poly-A tail, was
isolated by anchored PCR, completing the translated region of Chg 500.

The second choriogenin cDNA, pChg2a, was isolated by anchored PCR with
anchor primer NEB 1232 and reverse primer ROW 55, also designed to target Vtg II
sequence. BLAST analysis of the pChg2a sequence revealed that it shared 67% identity
with the medaka L-SF protein (Murata et al., 1995). The primer ROW 65 was then
designed from the 3' region of pChg2a to retrieve an overlapping cDNA that contained
a poly-A tail. The sequence from the resulting clone (pChg2b) completed the translated
portion of the second choriogenin, Chg 427.

The third Chg clone (pChg3a) was also isolated with ROW 45, along with the
initial pChg1a, but it remained unrecognized as a novel clone until further review of the
sequences. An overlapping clone (pChg3b) containing the poly-A tail was isolated by
anchored PCR with the forward primer GL2. A third clone, containing a short segment
5' to pChg3a that includes the initial methionine codon, was isolated using a rapid
amplification of cDNA ends (RACE) protocol (Frohman, 1992) with reverse primer
GL1 and 3 µg total liver RNA as template.
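
The joining of overlapping clones (for example pChg1a and pChg1b) into one contiguous cDNA can be illustrated with a minimal Python sketch. The function name, the minimum overlap of 10 bases, and the two toy fragments are assumptions made for illustration only; sequence assembly in this study was carried out with PC/GENE, not with this code.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 10) -> str:
    """Join two sequences where a suffix of `left` equals a prefix of `right`."""
    left, right = left.upper(), right.upper()
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first, then shorter ones.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of the required length was found")


# Toy fragments invented for illustration (not thesis data).
clone_a = "ATGGCAAGTCACTGGAGTGTCACC"
clone_b = "CACTGGAGTGTCACCCGTTGGGCC"
contig = merge_overlapping(clone_a, clone_b)
print(contig)
```

The sketch requires an exact match in the overlap; real clones may differ at a few positions because of sequencing or polymerase errors, which this code does not tolerate.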

Sequence Analyses

Nucleotide sequencing data were organized and assembled using the sequence
analysis software package PC/GENE (Intelligenetics, Mountain View, CA). A search
for post-translational modifications and signature sequences was done with the Prosite
program (Bairoch et al., 1995) available from the World Wide Web ExPASy molecular
biology server (http://expasy.hcuge.ch/www/expasy.top.html). Protein alignments were
performed with the ClustalV program (Higgins et al., 1992), utilizing a PAM 250 matrix
with fixed gap and floating gap penalties = 10. In order to compare Chg sequences with
a large number of ZP GenBank entries, a preliminary ClustalV alignment containing
complete sequences was performed. Whereas the N- and C-termini of the Chgs differ
greatly from those of mammalian ZP proteins, a core region of conserved sequence was
observed where all three Chgs, as well as all other reported ZPs, could be aligned with
a minimum number of gaps when anchored to five strictly conserved cysteines. This
region has previously been defined by Bork and Sander (1992) as the "ZP domain" and
is included in the Prosite program (Bairoch et al., 1995) available on the ExPASy
molecular biology server. For parsimonious tree analysis, a new ClustalV alignment was
performed including only the ZP domains from each entry, providing a well conserved
region on which to base our distance analysis. Parsimonious tree analysis was done by
importing a ClustalV alignment in PHYLIP 3.4 format into the PAUP 3.1 program
(Swofford, 1993) available from the Center for Biodiversity (Champaign, IL). The
unrooted tree presented in Figure 5.8 resulted from running 100 bootstrap replicates of
a heuristic search. All entries used in alignment and tree analysis were retrieved from
the Entrez document retrieval system, Release 20.0, available from NCBI (NIH,
Bethesda, MD). Sequences referred to in this paper include the flounder
Pseudopleuronectes americanus ZP or wf (gi: 425355; Lyons et al., 1993); medaka, O.
latipes L-SF (gi: 563774; Murata et al., 1995); goldfish, Carassius auratus ZP3
(gi: 763073; unpublished); carp, Cyprinus carpio ZP3i (gi: 763078; unpublished) and
ZP3ii (gi: 763080; unpublished); mouse, Mus musculus ZP1 (gi: 972946; Epifano et al.,
1995), ZP2 (gi: 202460; Liang et al., 1990), and ZP3 (gi: 141726; Ringuette et al.,
1988); human ZP3A (gi: 141724; Chamberlin and Dean, 1990), ZPB (gi: 458279; Harris
et al., 1994), and ZP2 (gi: 466206; Liang and Dean, 1993); cat, Felis catus ZPA
(gi: 458269), ZPB (gi: 458271), and ZPC (gi: 458273; Harris et al., 1994).
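
For readers unfamiliar with bootstrapping, the resampling step behind the 100 bootstrap replicates can be sketched in Python. This is an illustration only: it shows how alignment columns are drawn with replacement to build each pseudo-replicate, and it does not perform the parsimony tree search that PAUP carried out. The function name and the three short toy sequences are invented for the example.

```python
import random


def bootstrap_columns(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one pseudo-replicate: alignment columns resampled with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


# Toy alignment invented for illustration (not thesis data).
toy_alignment = {
    "seqA": "CDVEVASRVPCG",
    "seqB": "CDVPAPFRVQCG",
    "seqC": "CGVDPYLRIQCG",
}
rng = random.Random(0)  # fixed seed so the example is repeatable
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "pseudo-replicates generated")
```

Each pseudo-replicate would then be analysed in the same way as the original alignment, and the bootstrap value for a node is the percentage of replicate trees that recover it.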

Northern  Blot  Analyses 

Male  and  female  F.  heteroclitus  were  collected  from  the  estuarine  creeks  adjacent 
to  the  Whitney  Laboratory,  and  were  maintained  in  running  seawater  tanks  under 
14L:10D photoperiod conditions at 25 ± 2°C. After approximately two weeks in
captivity, fish began spawning in laboratory tanks on a 14-day cycle (Hsiao et al., 1994).
By monitoring the number of eggs spawned each day, we were able to calculate the 14-day
cycle  of  separate  tanks  and  thus  predict  what  phase  of  the  spawning  cycle  individual  fish 
were  in  before  sacrifice  (Hsiao  et  al.,  1996).  In  this  paper  two  northern  blots  were 
performed  using  a  female  fish  that  was  predicted  to  be  in  a  pre-maturational  phase,  four 
days  prior  to  spawning.  Fish  were  maintained  for  at  least  one  month  before  being  used 
for  RNA  collections. 
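
The phase prediction described above can be illustrated with a small Python sketch. It assumes, purely for illustration, that the day with the largest egg count marks the spawning peak and that peaks recur exactly every 14 days; the function names and the daily egg counts are invented, and the actual procedure is the one described by Hsiao et al. (1994, 1996).

```python
CYCLE_DAYS = 14  # cycle length reported in the text


def last_peak(daily_egg_counts: list[int]) -> int:
    """Index of the day with the largest egg count (first such day on ties)."""
    return max(range(len(daily_egg_counts)), key=daily_egg_counts.__getitem__)


def days_until_next_peak(last_peak_day: int, today: int, cycle: int = CYCLE_DAYS) -> int:
    """Days from `today` to the next predicted peak (0 if today is a predicted peak)."""
    if today < last_peak_day:
        raise ValueError("today must not precede the last observed peak")
    elapsed = (today - last_peak_day) % cycle
    return (cycle - elapsed) % cycle


# Invented daily egg counts for one tank (day 0 is the first day of monitoring).
counts = [5, 40, 120, 60, 10, 0, 0, 0, 0, 0, 0, 0, 0, 2, 8]
peak = last_peak(counts)
print(peak, days_until_next_peak(peak, today=12))
```

With these invented counts, day 12 falls four days before the next predicted peak, which corresponds to the pre-maturational phase sampled for the northern blots.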

Experimental groups of fish were subjected to two intraperitoneal injections of
estradiol-17β (0.01 mg/g body weight) dissolved in 50 µl coconut oil (Kanungo et al.,
1990). Control groups were sham-injected with coconut oil alone. The first
injection was performed on day 1, the second injection on day 4, followed by sacrifice
and liver dissection on day 8.

Total RNA was isolated from livers and ovaries by extraction with RNA Stat-60
reagents (Tel-Test "B", Inc., Friendswood, TX). Tissues were dissected and immediately
frozen in 1.5-ml tubes containing 500 µl of RNA Stat-60 emulsion, by immersion in
liquid nitrogen. Tissues were homogenized at 20°C using a Kontes pestle and motor.
Typically, a 300 mg liver yielded 0.35 mg total RNA, with O.D. 260/280 ratios
consistently above 1.8. Total RNA samples were resuspended in DEPC-treated water
and stored at -80°C until used in analyses.
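
The yield and purity figures quoted above follow from simple arithmetic, sketched here in Python. The sketch assumes the common spectrophotometric convention that an absorbance of 1.0 at 260 nm corresponds to roughly 40 µg/ml of RNA; the text does not state which conversion was used, and the function names and absorbance readings are invented for illustration.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # assumed conversion factor for RNA


def rna_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate RNA concentration of the undiluted sample from an A260 reading."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """O.D. 260/280 ratio used as a rough indicator of protein contamination."""
    return a260 / a280


def yield_ug_per_mg_tissue(total_rna_mg: float, tissue_mg: float) -> float:
    """Total RNA recovered per mg of tissue, in micrograms."""
    return total_rna_mg * 1000.0 / tissue_mg


# Yield reported in the text: 0.35 mg RNA from a 300 mg liver.
print(round(yield_ug_per_mg_tissue(0.35, 300), 2), "ug RNA per mg liver")
# Invented readings for a sample diluted 1:100.
print(rna_concentration_ug_per_ml(0.5, dilution_factor=100), "ug/ml")
print(round(purity_ratio(0.5, 0.26), 2))
```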

Before electrophoresis, aliquots of 15 µg total RNA were precipitated in
isopropanol  and  denatured  in  2.2  M  formaldehyde,  50%  formamide,  50  mM  MOPS  (pH 
7.0)  for  30  min  at  65°C.  Samples  were  electrophoresed  through  gels  containing  2.0% 
agarose,  0.6  M  formaldehyde,  50  mM  MOPS,  and  1  mM  EDTA  for  1.5  hours  at  3.5 
V/cm  gel  in  50  mM  MOPS,  1  mM  EDTA  running  buffer.  RNA  was  blotted  onto  Magna 
nylon  membranes  by  capillary  action  with  20  X  SSC,  immobilized  by  U.V.  crosslinking 
and  visualized  by  staining  briefly  with  methylene  blue.  All  hybridizations  were  carried 
out  at  65°C  in  1  X  Denhardt's  solution,  6  X  SSC,  and  0.1%  SDS  without  formamide. 

Isolation and Partial Characterization of the Major VEPs

VEPs were isolated following the protocols of Oppen-Berntsen et al. (1990) and
Hyllner  et  al.  (1991)  with  slight  modifications.  Ovarian  follicles  were  dissected 
from  the  ovary  of  a  reproductively  active  F.  heteroclitus.  Up  to  30  individual  unovulated 
follicles  were  placed  in  a  1.5  ml  Eppendorf  tube  containing  an  ice-cold  solution  of  0.1 
M  EDTA  and  0.5  M  NaCl.  The  follicles  were  gently  ground  with  a  Kontes  pestle.  The 
intact  vitelline  envelopes  were  collected  by  a  low  speed  spin  (150  g)  for  5  min,  and  the 
supernatant  containing  mainly  yolk  was  discarded.  The  insoluble  vitelline  envelope 
material  was  washed  over  24  hours  with  at  least  five  changes  of  ice-cold  0.5  M  NaCl 
followed  by  five  changes  of  Milli  Q  water,  each  time  collecting  the  material  by 
centrifugation at 150 g. VEPs were solubilized in a Tris-buffered extraction buffer (0.1
M Tris-HCl, pH 8.8; 2% SDS; 0.3 M 2-mercaptoethanol; 0.1 M EGTA) by heating to
70°C for 30 min.

For  electrophoresis  of  VEPs,  samples  were  diluted  at  least  1:4  in  sample  buffer 
(0.06  M  Tris-HCl,  pH  6.8;  2%  SDS;  0.3  M  2-mercaptoethanol;  10%  glycerol;  without 
bromophenol  blue)  and  heated  to  95°C  for  5  min.  Sodium  dodecyl  sulphate- 
polyacrylamide  gel  electrophoresis  (SDS-PAGE)  was  carried  out  according  to  Laemmli 
(1970)  using  125  mm  X  110  mm  X  1.5  mm  slab  gels  containing  a  3.5%  stacking  gel 
overlaying  a  10%  w/v  separating  gel,  with  modifications  based  on  the  protocol  of 
Schagger  and  von  Jagow  (1987),  using  Tris-tricine  running  buffers. 

Initial  attempts  to  transfer  VEPs  onto  membranes  using  buffers  containing  0.01 
M  morpholinoethane  sulphonic  acid  (MES)  and  20%  methanol  failed,  probably  due  to 
the insolubility of the VEPs. Successful transfer of the VEPs was accomplished in 0.01
M MES, 10% methanol, and 0.01% SDS. After transfer, the PVDF membrane was
stained  with  0.02%  Coomassie  blue  in  40%  methanol  and  5%  acetic  acid  to  indicate 
protein  bands.  Duplicate  membrane  transfers  containing  VEP  69,  60,  and  46  were 
submitted  for  amino  acid  composition  analysis  and  N-terminal  amino  acid  analysis  using 
an  Applied  Biosystems  Model  473a  Sequencer  (LeGendre  and  Matsudaira,  1988)  at  the 
Protein  Chemistry  Core  Facility  of  the  University  of  Florida  Interdisciplinary  Center  for 
Biotechnology  Research. 

The  N-terminal  sequences  initially  obtained  from  the  three  bands  consisted  of 
overlapping  and  weak  signals  that  were  only  five  residues  long  (data  not  shown); 
therefore  the  VEP  69,  60,  and  46  were  isolated  again  and  subjected  to  in-gel  digestion 
with endoproteinase Lys C (0.003 units/µg protein, Promega), in 10 mM Tris, pH 8.8,
0.2 M ammonium bicarbonate, 0.1% SDS, and 0.1 mM EDTA. Protein fragments were
separated  by  Tris-tricine  gels,  blotted  onto  PVDF  and  visualized  by  silver  staining.  Two 
of  the  best  resolved  bands  from  each  digestion  were  once  again  submitted  to  the  Protein 
Chemistry  Core  for  N-terminal  amino  acid  sequencing. 
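
To illustrate what the Lys C digestion produces, the following Python sketch splits a protein sequence after every lysine, which is the cleavage specificity of endoproteinase Lys C. It is a simplified model: it assumes complete digestion and ignores any effect of neighbouring residues. The function name is invented, and the example peptide is residues 23-44 of Chg 427 as printed in Figure 5.2B.

```python
def lys_c_fragments(protein: str) -> list[str]:
    """Split a protein on the C-terminal side of every lysine (K)."""
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        if residue == "K":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal piece that does not end in lysine.
        fragments.append(protein[start:])
    return fragments


segment = "QGYAKPGKPSKPQSPPTQNQQQ"  # Chg 427, residues 23-44 (Fig. 5.2B)
fragments = lys_c_fragments(segment)
print(fragments)
```

The N-terminus of each fragment other than the first begins immediately after a lysine, which is the property that allows internal sequences obtained by Edman degradation to be placed on a predicted protein sequence.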

Results 

Choriogenin  cDNA  Sequences 

The  nucleotide  and  translated  amino  acid  sequences  from  three  estrogen-induced 
liver  cDNAs  are  presented  in  Figure  5.2.  We  have  designated  the  cDNAs  and  predicted 
protein products as Chg 500, Chg 427 and Chg 553, according to the number of residues
in  the  predicted  amino  acid  sequence. 
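
As an illustration of conceptual translation, the following Python sketch translates an open reading frame with the standard genetic code and counts the residues, which is the basis for the Chg names. The helper names are invented; the example input is the first 66 nucleotides of the Chg 427 open reading frame as printed in Figure 5.2B.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    seq = cdna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


orf_start = "ATGATGATGAAGTGGACTGTCTTTTGCGTTGTGGCGCTGGCTTTGCTTGGCAGCTTCTGTGATGCT"
peptide = translate(orf_start)
print(peptide, len(peptide))
```

Applied to a complete open reading frame, the length of the returned string gives the residue count used in the designation (for example, 427 for Chg 427).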

The  cDNA  encoding  Chg  500  is  1641  bp  long,  including  a  1500  bp  open  reading 
frame  (Fig.  5.2a).  The  calculated  molecular  weight  after  subtracting  the  weight  of  a 
predicted signal peptide (residues 1-22) is 53,125. The most notable region of the
predicted  primary  structure  is  a  proline-rich  repeating  domain  near  the  N-terminus, 
including  five  repeats  of  (PQQ  PQQ  PQY  PSK).  Other  proteins  that  share  sequence 
identity  with  Chg  500  include  the  flounder  ZP  gene  product  (58%;  Lyons  et  al.,  1993), 
which  also  contains  a  proline-rich  repeating  domain  (Figs.  5.3a  and  5.3b),  and  several 
mammalian  ZP  proteins,  including  mouse  ZP1  (32%;  Epifano  et  al.,  1995),  cat  ZPB 
(35%;  Harris  et  al.,  1994),  and  human  ZPB  (34%;  Harris  et  al.,  1994). 
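
The molecular weight calculation described above (total weight minus the predicted signal peptide) can be sketched in Python. This is an approximation under stated assumptions: it uses average residue masses from standard tables rounded to two decimals, adds one water per chain, and ignores glycosylation and other modifications. It is not the routine used to obtain the values reported here, and the function names and the toy peptide are invented.

```python
# Average residue masses in daltons (amino acid minus water), rounded.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02


def molecular_weight(protein: str) -> float:
    """Sum of residue masses plus one water for the free termini."""
    return sum(RESIDUE_MASS[residue] for residue in protein.upper()) + WATER


def mature_molecular_weight(protein: str, signal_length: int) -> float:
    """Molecular weight after removing the first `signal_length` residues."""
    return molecular_weight(protein[signal_length:])


# Toy peptide: a two-residue "signal" (MK) followed by Gly-Ala.
print(round(mature_molecular_weight("MKGA", signal_length=2), 2))
```

For a full-length choriogenin, the translated sequence and the length of its predicted signal peptide would be passed in place of the toy values.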

The  Chg  427  is  encoded  by  a  cDNA  of  1751  bp  (Fig.  5.2b).  Subtracting  the 
weight  of  the  predicted  signal  peptide  (residues  1-24)  leaves  a  calculated  molecular 
weight  of  44,892.  This  protein  sequence  does  not  include  a  substantial  repeating  domain, 
although  residues  28-46  (PGK  PSK  PQS  PPT  QNQ  QQL  Q)  are  reminiscent  of  the 
proline-rich  repeat  previously  described  for  Chg  500.  The  predicted  N-terminus 
possesses  three  in-frame  methionine  codons,  but  the  first  codon  agrees  best  with  the 
context  and  positional  environment  for  initiation  of  translation  as  described  by  Kozak 
(1991).  Alignment  analyses  revealed  that  Chg  427  shares  highest  identity  (67%)  with  a 
medaka  female-specific  protein  termed  "L-SF"  (Murata  et  al.,  1995)  (Fig.  5.4).  The 
next  highest  identity  comes  from  sequences  recently  deposited  for  two  cyprinid  fishes 
(42% with C. auratus ZP3, 43% with C. carpio ZP3i, and 44% with C. carpio ZP3ii).
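
The sequence context of the three in-frame methionine codons of Chg 427 can be inspected with a short Python sketch. It reports only the bases at positions -3 and +4 relative to each ATG (the A of ATG being +1), the two positions given most weight in Kozak's analyses; it does not rank the codons and does not model the positional (first-AUG) preference. The function name is invented, and the input is the 5' untranslated region and first five codons as printed in Figure 5.2B.

```python
def start_codon_contexts(mrna: str, first_atg: int, n_codons: int) -> list[tuple]:
    """For each in-frame ATG, return (1-based position, base at -3, base at +4)."""
    seq = mrna.upper()
    results = []
    for k in range(n_codons):
        pos = first_atg + 3 * k
        if seq[pos:pos + 3] != "ATG":
            continue
        minus3 = seq[pos - 3] if pos >= 3 else None
        plus4 = seq[pos + 3] if pos + 3 < len(seq) else None
        results.append((pos + 1, minus3, plus4))
    return results


five_prime_utr = "gaacttttcagatcacttgtgtttgtgaagcc"  # 32 nt (Fig. 5.2B)
first_codons = "ATGATGATGAAGTGG"
contexts = start_codon_contexts(five_prime_utr + first_codons,
                                first_atg=len(five_prime_utr), n_codons=4)
for position, minus3, plus4 in contexts:
    print(position, minus3, plus4)
```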


Figure  5.2     Nucleotide  and  conceptually  translated  amino  acid  sequences  of  F. 
heteroclitus  choriogenins. 

A)  The  Chg  500  cDNA  (1641  bp)  codes  for  a  500  amino  acid 
protein  sequence,  containing  a  predicted  signal  peptide  (von 
Heijne,  1986)  from  residues  1-22,  indicated  by  shading.  The  poly- 
adenylation  signal  is  indicated  by  underlining  (beginning  at 
nucleotide  1607). 

B)  The  Chg  427  cDNA  (1672  bp)  codes  for  a  427  amino  acid 
protein  sequence,  containing  a  predicted  signal  peptide  from 
residues  1-22  indicated  by  shading.  A  poly-adenylation  signal  is 
represented  by  underlining  (beginning  at  nucleotide  1637). 

C)  The  Chg  553  cDNA  (1816  bp)  codes  for  a  553  amino  acid 
protein  sequence,  containing  a  predicted  signal  peptide  from 
residues  1-25  indicated  by  shading.  A  poly-adenylation  signal  is 
represented  by  underlining  (beginning  at  nucleotide  1767). 


actaactagaccagacagcttcgaggt  27 


ATGGCAAGTCACTGGAGTGTCACCCGTTGGGCCGCGCTCGCTCTGCTATGCTGCTTAGCTGGGAAA  93
MASHWSVTRWAALALLCCLAGK  22

GGAGCAGAGGCTCAGAAGGGTTCGTATCCTCCGCAACCTCAAAAGCCTTCGTACCCTCAGAATCCT  159
GAEAQKGSYPPQPQKPSYPQNP  44

CAAACGCCTTCGTATCCTCAGCAACCTCAAAAGCCTTCGTACCCTCAGAATCCTCAAACGCCTTCG  225
QTPSYPQQPQKPSYPQNPQTPS  66

TACCCTCAGTATCCTCAAACACCTTCAAACCCTCAGCAACCTCAGTATCCTCAAACACCTTCAAAC  291
YPQYPQTPSNPQQPQYPQTPSN  88

CCTCAGTATCCTCAAACGCCTTCGTACCCTCAGAATCCTCAAACGCCTTCGTACCCTCAGAATCCT  357
PQYPQTPSYPQNPQTPSYPQNP  110

CAAACGCCTTCGTACCCTCAGAACCCTCAGCAACCTCAATTGTCGTGGGATTTTTCAAAGCCTACA  423 
QTPSYPQNPQQPQLSWDFSKPT  132 

AAACCTCAATATCCTAAGCCCCAAAGGCCTCCATCAAAACCTCAATATCCTAGGCCCCAAACGCCT  489 
KPQYPKPQRPPSKPQYPRPQTP  154 

CCTTCAAAACCTCAATATCCTAGGCCTCAAACGCCCCAACAACCTGGAAAAAAACAATGGGATGAT  555 
PSKPQYPRPQTPQQPGKKQWDD  176

ACAAAGACTCCGAATGTCCCTTCCAAGAGACCAGAGGCCCCTGGAGTTCCCACCCCTAAAAGTTGT  621 
TKTPNVPSKRPEAPGVPTPKSC  198 

GACGTGGAAGTAGCTTCAAGAGTCCCCTGTGGAGCTTCTGCCGTCTCTGCTACTGAATGTGAGGCC  687 
DVEVASRVPCGASAVSATECEA  220 

AGAGACTGTTGCTTTGATGGCCAGTCATGCTACTTTGCAAAAGGAGTGACAGTCCAGTGTACCAAG  753
RDCCFDGQSCYFAKGVTVQCTK  242

GATGGCCATTTTATCGTTGTTGTGGCCAAAGATGTCACCCTGCCACACATTGACCTTGAAACAATC  819
DGHFIVVVAKDVTLPHIDLETI  264

TCATTGTTGGGAGGAGGTCAAGGCTGTACACATGTTGACCCCAATTCACTTTTTGCCATCTACTAC  885
SLLGGGQGCTHVDPNSLFAIYY  286

TTTCCCGTTACTGCTTGTGGGACTGTTGTCATGGAGGAGCCTGGCGTTATAATGTATGAGAATCGG  951
FPVTACGTVVMEEPGVIMYENR  308

ATGACCTCCTCATATGAAGTAGGAGTTGGGCCTCTTGGAGCCATTACCAGGGACAGCACCTACGAA  1017
MTSSYEVGVGPLGAITRDSTYE  330 

TTGCTCTTCCAGTGTAGGTACATTGGCACCTCAGTTGAAACTTTGGTGGTCGAAGTGCTGCCATTA  1083 
LLFQCRYIGTSVETLVVEVLPL  352 

GACAATCCTCCTCCAGCAGTTGCTGAGCTCGGACCGATCAGAGTGGCCCTTAGGTTGGCCAATGGC  1149 
DNPPPAVAELGPIRVALRLANG  374 

CAGTGTGCTACAAAGGGTTGCAACGAAGCGGAGGTAGCCTACACCTCCTACTATTTGGACTCAGAC  1215 
QCATKGCNEAEVAYTSYYLDSD  396 

TATCCGATTACCAAGATACTGAGGGATCCCGTGTATGTGGAGGTTCAGCTCCTTGAAAAGACAGAT  1281
YPITKILRDPVYVEVQLLEKTD  418 

CCCGCTCTGGTTCTGACTCTTGGACGTTGTTGGGCAACCACTAGCCCCAATCCTCACAGCTTGCCC  1347 
PALVLTLGRCWATTSPNPHSLP  440 

CAGTGGGACATTCTGATTGACGGATGTCCCTACACGGATGATCGTTACCTCTCCACACTGGTTCCA  1413 
QWDILIDGCPYTDDRYLSTLVP  462

GTGGACGCCTCTTCTGGTCTGCAATTTCCAAGTCACTACCGGCGTTTCACTTTCAAAATGTTTACC  1479 
VDASSGLQFPSHYRRFTFKMFT  484 

TTTGTGGACACCACTGCAATGGACCCCCTGAGGGAAAATGTGTACATTCACTGTAGCACAGCTGTG  1545 
FVDTTAMDPLRENVYIHCSTAV  506 


TGCGTGCCAGGACAGGGTGTCAGCTGCGAACCATCATGCAACAGAAAAGGAAAGAGAGACACTGAG  1611 
CVPGQGVSCEPSCNRKGKRDTE  528 

GCTGCAGAGCAGAGGAAGGTCGAACCAAAGGTTGTGGTTTCGTCCGGAGAAGTGATCATGACCGCT  1677
AAEQRKVEPKVVVSSGEVIMTA  550 

CCTCAGGAGTAAtctgggacaagctcaggaattcatctgggaacatttagacaaaactctttgaaa  1743 
P     Q     E     -  553 


atcaacaacgttgttgaacagtaaataaaaatgtcaccctaagtaaaaaaaaaaaaaaaaaaaaaa  1809
aaaaaaa  1816 

gaacttttcagatcacttgtgtttgtgaagcc  32

ATGATGATGAAGTGGACTGTCTTTTGCGTTGTGGCGCTGGCTTTGCTTGGCAGCTTCTGTGATGCT  98
MMMKWTVFCVVALALLGSFCDA  22

CAGGGGTACGCGAAACCTGGTAAGCCATCAAAACCCCAATCACCACCTACGCAAAACCAACAGCAA  164
QGYAKPGKPSKPQSPPTQNQQQ  44

TTGCAGACATTTGAGAAAGAGCTCACCTGGAAGTACCCCGACGATCCCCAGCCAGACCCCAAGCCT  230
LQTFEKELTWKYPDDPQPDPKP  66

AATGTGCCATTTGAGTTGAGATACCCTGTTCCTGCTGCAACCGTTGCTGTTGAGTGCAGAGAGAGC  296 
NVPFELRYPVPAATVAVECRES  88 

ATAGCTCACGTGGAGGTCAAGAAAGACATGTTTGGCACCGGCCAGCCGATCAATCCAAATGACCTC  362 
IAHVEVKKDMFGTGQPINPNDL  110 

ACCCTGGGTAACTGTGCGCCTGTTGGAGAGGATAGTGCCGCTCAAGTGTTGATTTATGAAGCTGAA  428 
TLGNCAPVGEDSAAQVLIYEAE  132 

CTGCATCAATGCGGAAGCCAGCTGATGATGACAAATGATGCTCTCGTCTACACCTTCGTTTTGAAC  494 
LHQCGSQLMMTNDALVYTFVLN  154 

TATAACCCTACGCCTTTGGGATCGGTTCCTGTTGTGAGAACCTCCCAAGCTGCTGTGATCGTGGAA  560 
YNPTPLGSVPVVRTSQAAVIVE  176 

TGCCACTACCCAAGGAAGCACAATGTGAGCAGCCTTCCTCTGGATCCCCTTTGGGTCCCATTCTCT  626 
CHYPRKHNVSSLPLDPLWVPFS  198 

GCAGTTAAGATGGCTGAGGAGTTCCTGTACTTCACTATGAAACTCATGACTGATGACTGGATGTAC  692 
AVKMAEEFLYFTMKLMTDDWMY  220 

CAGAGGCCAAGCTACCAGTATTTCCTGGGAGACCTGATCCGTATAGAGGTTACTGTCAAGCAATAC  758 
QRPSYQYFLGDLIRIEVTVKQY  242 

TTCCATGTACCCCTGCGTGTTTACGTGGACAGATGTGTGGCAACCCTCTCTCCTGATGTAACCTCA  824 
FHVPLRVYVDRCVATLSPDVTS  264 

AGCCCCAACTATGCCTTCATTGATAACTTTGGGTGTTTGATTGACGCCAGAATCACAGGCTCTGAC  890 
SPNYAFIDNFGCLIDARITGSD  286 

TCAAAGTTCATGGCTCGCACCCAAGAGAACCACCTTCAGTTCCAGCTGGAGGCCTTCAGGTTCCAG  956
SKFMARTQENHLQFQLEAFRFQ  308

AATTCTGACAGTGGAGTGATCTACATCACCTGCTACTTGAAGGCAACGTCTACTAGCCAGGCCATA  1022 
NSDSGVIYITCYLKATSTSQAI  330 

GACAGCCAGCACAGAGCTTGTTCCTACACTGGCGGATGGAGGGAGGCCAGTGGAGTTGATGGAGCT  1088 
DSQHRACSYTGGWREASGVDGA  352 

TGTGGTTCTTGTGAGACCAACGTGACGCCGTACACCGCTCCAGCAGTTACATTCGCTTCACCACCT  1154 
CGSCETNVTPYTAPAVTFASPP  374 

GTCGTTGTTACTGATGGTGGTGGAGTAACGCTTCCAGCTCCAGGCAGTCCAAAAGTCCCTTATAAT  1220
VVVTDGGGVTLPAPGSPKVPYN  396 

CCGAGGAAAGTCCGTGACGTCACCCAAGCCGAAATTTTGGAATGGGAAGGCGTTGTCTCTCTGGGC  1286
PRKVRDVTQAEILEWEGVVSLG  418 

CCCATCCCCATCATGGAGAAGAAACTCTGAaaaacagaagtgtaacatgatattccgccgtagcca  1352 
PIPIMEKKL-  427 

tgaacaccataataaaaagtatcattggttcatatcgctgtctatgttatgcctatgtctcatggt  1418 
agattttcttaaacaagtaacaaacccccacttagtctcttaaatctgcttaaaattttaaatatt  1484 
gacaaatttccaaaaaattgtagaggtctttttttaggggggagggataaatgaaggaaaacttgt  1550 
cttagattcccttttatgtaatggtaaggcagtgtgtggacccccatgtgtccagcaccataatct  1616 
gtaaccctccttttcatgaaaataaaattcgcaactataaaaaaaaaaaaaaaaaa  1672 


Figure  5.2-continued 
C)


actaactagaccagacagcttcgaggt  27 

ATGGCAAGTCACTGGAGTGTCACCCGTTGGGCCGCGCTGGCTCTGCTATGCTGCTTAGCTGGGAAA  93
MASHWSVTRWAALALLCCLAGK  22

GGAGCAGAGGCTCAGAAGGGTTCGTATCCTCCGCAACCTCAAAAGCCTTCGTACCCTCAGAATCCT  159
GAEAQKGSYPPQPQKPSYPQNP  44

CAAACGCCTTCGTATCCTCAGCAACCTCAAAAGCCTTCGTACCCTCAGAATCCTCAAACGCCTTCG  225
QTPSYPQQPQKPSYPQNPQTPS  66

TACCCTCAGTATCCTCAAACACCTTCAAACCCTCAGCAACCTCAGTATCCTCAAACACCTTCAAAC  291
YPQYPQTPSNPQQPQYPQTPSN  88

CCTCAGTATCCTCAAACGCCTTCGTACCCTCAGAATCCTCAAACGCCTTCGTACCCTCAGAATCCT  357
PQYPQTPSYPQNPQTPSYPQNP  110

CAAACGCCTTCGTACCCTCAGAACCCTCAGCAACCTCAATTGTCGTGGGATTTTTCAAAGCCTACA  423 
QTPSYPQNPQQPQLSWDFSKPT  132 

AAACCTCAATATCCTAAGCCCCAAAGGCCTCCATCAAAACCTCAATATCCTAGGCCCCAAACGCCT  489 
KPQYPKPQRPPSKPQYPRPQTP  154

CCTTCAAAACCTCAATATCCTAGGCCTCAAACGCCCCAACAACCTGGAAAAAAACAATGGGATGAT  555
PSKPQYPRPQTPQQPGKKQWDD  176

ACAAAGACTCCGAATGTCCCTTCCAAGAGACCAGAGGCCCCTGGAGTTCCCACCCCTAAAAGTTGT  621
TKTPNVPSKRPEAPGVPTPKSC  198

GACGTGGAAGTAGCTTCAAGAGTCCCCTGTGGAGCTTCTGCCGTCTCTGCTACTGAATGTGAGGCC  687 
DVEVASRVPCGASAVSATECEA  220 

AGAGACTGTTGCTTTGATGGCCAGTCATGCTACTTTGCAAAAGGAGTGACAGTCCAGTGTACCAAG  753 
RDCCFDGQSCYFAKGVTVQCTK  242 

GATGGCCATTTTATCGTTGTTGTGGCCAAAGATGTCACCCTGCCACACATTGACCTTGAAACAATC  819 
DGHFIVVVAKDVTLPHIDLETI  264 

TCATTGTTGGGAGGAGGTCAAGGCTGTACACATGTTGACCCCAATTCACTTTTTGCCATCTACTAC  885
SLLGGGQGCTHVDPNSLFAIYY  286

TTTCCCGTTACTGCTTGTGGGACTGTTGTCATGGAGGAGCCTGGCGTTATAATGTATGAGAATCGG  951
FPVTACGTVVMEEPGVIMYENR  308

ATGACCTCCTCATATGAAGTAGGAGTTGGGCCTCTTGGAGCCATTACCAGGGACAGCACCTACGAA  1017
MTSSYEVGVGPLGAITRDSTYE  330

TTGCTCTTCCAGTGTAGGTACATTGGCACCTCAGTTGAAACTTTGGTGGTCGAAGTGCTGCCATTA  1083 
LLFQCRYIGTSVETLVVEVLPL  352

GACAATCCTCCTCCAGCAGTTGCTGAGCTCGGACCGATCAGAGTGGCCCTTAGGTTGGCCAATGGC  1149 
DNPPPAVAELGPIRVALRLANG  374 

CAGTGTGCTACAAAGGGTTGCAACGAAGCGGAGGTAGCCTACACCTCCTACTATTTGGACTCAGAC  1215 
QCATKGCNEAEVAYTSYYLDSD  396 

TATCCGATTACCAAGATACTGAGGGATCCCGTGTATGTGGAGGTTCAGCTCCTTGAAAAGACAGAT  1281 
YPITKILRDPVYVEVQLLEKTD  418 

CCCGCTCTGGTTCTGACTCTTGGACGTTGTTGGGCAACCACTAGCCCCAATCCTCACAGCTTGCCC  1347
PALVLTLGRCWATTSPNPHSLP  440 

CAGTGGGACATTCTGATTGACGGATGTCCCTACACGGATGATCGTTACCTCTCCACACTGGTTCCA  1413 
QWDILIDGCPYTDDRYLSTLVP  462 

GTGGACGCCTCTTCTGGTCTGCAATTTCCAAGTCACTACCGGCGTTTCACTTTCAAAATGTTTACC  1479 
VDASSGLQFPSHYRRFTFKMFT  484 

TTTGTGGACACCACTGCAATGGACCCCCTGAGGGAAAATGTGTACATTCACTGTAGCACAGCTGTG  1545
FVDTTAMDPLRENVYIHCSTAV  506

TGCGTGCCAGGACAGGGTGTCAGCTGCGAACCATCATGCAACAGAAAAGGAAAGAGAGACACTGAG  1611
CVPGQGVSCEPSCNRKGKRDTE  528

GCTGCAGAGCAGAGGAAGGTCGAACCAAAGGTTGTGGTTTCGTCCGGAGAAGTGATCATGACCGCT  1677
AAEQRKVEPKVVVSSGEVIMTA  550

CCTCAGGAGTAAtctgggacaagctcaggaattcatctgggaacatttagacaaaactctttgaaa  1743
P    Q    E    -  553

atcaacaaggttgttgaacagtaaataaaaatgtcaccctaagtaaaaaaaaaaaaaaaaaaaaaa  1809 
aaaaaaa  1816 

Mammalian ZP proteins sharing high identity with Chg 427 include mouse ZP3 (30%;
Ringuette et al., 1988), cat ZPC (32%; Harris et al., 1994) and human ZP3A (32%;
Chamberlin and Dean, 1990) (Fig. 5.4). Chg 427 shares little identity (18%) with
Chg 500 and Chg 553 (not shown). A Prosite scan predicted only one N-glycosylation
site, at residues 184-187 (Fig. 5.2b).

Chg 553 was translated from a 1817-bp cDNA (Fig. 5.2c). Subtraction of a
predicted signal peptide (residues 1-26) resulted in a calculated molecular weight of
58,290. Chg 553 is 62% identical to Chg 500, and likewise shares identity with the
flounder ZP (52%) (Fig. 5.3a), mouse ZP1 (30%; Epifano et al., 1995), cat ZPB (29%;
Harris et al., 1994), and human ZPB (28%; Harris et al., 1994). The N-terminal region
of  Chg  553  contains  a  proline-rich  repeating  domain  that  differs  from  that  of  Chg  500 
by  containing  only  half  as  many  glutamine  residues. 
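
Percent identity values such as those quoted above depend on how the aligned positions are counted. The following Python sketch shows one common convention, identical residues divided by the number of columns in which neither sequence has a gap. This convention is an assumption made for illustration and may differ from the calculation used by the alignment software in this study; the function name and the two toy sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences invented for illustration ("-" marks a gap).
print(percent_identity("MKWTVFCV-AL", "MKFTAVCLWAL"))
```

Dividing instead by the full alignment length or by the length of the shorter sequence would give lower or higher values for the same alignment, so the convention should be stated whenever identities are compared between studies.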

A  ClustalV  alignment  (not  shown)  containing  the  ZP  domains  (Bork  and  Sander, 
1992)  of  seventeen  reported  sequences,  including  the  three  Chg  sequences,  and  five  other 
reported  sequences  from  fish,  plus  three  mouse,  three  cat,  and  three  human  ZPs,  was 
used  in  parsimony  analysis.  The  shortest  tree  resulting  from  a  heuristic  search  with  100 
bootstrap  replicates  is  presented  in  Figure  5.8.  The  resulting  unrooted  tree  was  drawn 
according  to  the  format  of  the  Fitch  analysis  program  in  order  to  emphasize  relatedness 
among  sequences  rather  than  a  deduced  ancestral  relationship.  Bootstrap  values  are 
indicated  adjacent  to  the  appropriate  nodes.  The  results  of  the  analysis  suggest  that  three 
major  groups  of  ZP  proteins  can  be  described,  each  one  containing  a  separate  set  of 
mouse,  cat,  and  human  ZP  sequences.  In  this  paper  we  refer  to  these  groups  according 


Figure  5.3     Alignments  of  Chg  500,  Chg  553  and  the  flounder  ZP  protein. 

A) A ClustalV (Higgins et al., 1992) alignment including predicted
amino acid sequences of Chg 500, Chg 553, and a flounder ZP
protein (Lyons et al., 1993). A conserved core region sharing
sequence identity with other ZP proteins and designated as a "ZP
domain" (Bork and Sander, 1992) is denoted by a dark line. This
is the core domain used for drawing the tree shown in Figure 5.8.

B)  A  ClustalV  alignment  modified  by  eye  of  the  proline-glutamine 
rich  repeating  region  from  Chg  500  and  the  flounder  ZP  protein. 
A  Pro-Gln-X  triplet  is  strongly  conserved  throughout  the  region. 

[Figure 5.3 alignments. Panel A: amino acid alignment of Chg 500, Chg 553, and
flounder ZP. Panel B: nucleotide and amino acid alignment of the proline-glutamine
rich repeating region of Chg 500 and flounder ZP.]

[Figure 5.4 alignment. Rows: Chg 427, medaka L-SF, mouse ZP3, cat ZPC, and
human ZP3A.]

Figure 5.4 A ClustalV alignment of Chg 427 against the medaka L-SF protein
(Murata et al., 1995), mouse ZP3 (Ringuette et al., 1988), cat ZPC
(Harris et al., 1994), and human ZP3A (Chamberlin and Dean, 1990). A
conserved core region sharing sequence identity with other ZP proteins
and designated as a "ZP domain" (Bork and Sander, 1992) is denoted by
a dark line. This is the core domain used for drawing the tree shown in
Figure 5.8.
to the mouse sequence that they contain, thus the ZP1, ZP2, and ZP3 subdivisions. The
fish sequences represented in the tree separated into two major subdivisions: one
containing Chg 427, medaka L-SF, and the three carp sequences, which grouped with
the mouse ZP3 subdivision; and another containing Chg 500, Chg 553, and the
flounder ZP, which grouped with the mouse ZP1 subdivision. Of the eight fish
sequences analyzed, none showed significant relatedness to the mammalian ZP2
subdivision; however, bootstrap values at the node dividing the ZP2 and ZP3
subdivisions were the lowest on the tree, arguing against placing much weight on
this delineation.

Northern  Blot  Analysis 

Northern blot analysis using three separate random-primed [32P] probes for Chg
500, 427, and 553 revealed Chg mRNAs in liver RNA from both estrogen-treated
males and spawning females (Fig. 5.5). Furthermore, when 20.0 µg of ovary RNA was
blotted next to 2.0 µg of liver RNA, Chg transcripts were apparent only in RNA from
the liver (Fig. 5.6).

Vitelline Envelope Proteins

VEPs  were  isolated  from  ovarian  follicles  and  resolved  by  SDS-PAGE  into  three 
major  Coomassie  blue-staining  bands  at  estimated  molecular  weights  of  69,000,  60,000, 
and  46,000,  designated  as  VEP  69,  VEP  60,  and  VEP  46,  respectively  (Fig.  5.7).  At 
least  one  other  band  could  be  visualized  between  VEP  60  and  VEP  69,  but  appeared  too 


Figure 5.5 Northern blot analysis using Chg 500, 427, and 553 as probes.

A) Methylene blue staining of a nylon membrane indicating
equivalent loading of six lanes with total RNA. Lanes a, d, and
i contain RNA kb markers; lanes b, e, and g each contain 15 µg
of the same total liver RNA isolated from a single estrogen-treated
male. Lanes c, f, and h contain 15 µg of total liver RNA isolated
from a single female, approximately four days before spawning.
28S and 18S ribosomal RNA bands are indicated in the total RNA
lanes, suggesting RNA preparations free of RNase
contamination.

B) Autoradiograph of the same nylon membrane after being cut
into three pieces and hybridized (65°C) to the 32P-labeled random-primed
Chg probes indicated above the blot. Positions of RNA
markers are shown to the left, indicating 4.4, 2.37, and 1.35 kb
RNA.
faint to isolate. Amino acid analysis revealed that extraordinarily high proline and
glutamine compositions were present in VEP 69 and 60 (Table 5.1), agreeing well with
reported VEP compositions from other teleosts (Hyllner et al., 1991, 1995; Hamazaki
et al., 1987). The amino acid compositions of the isolated VEPs are compared to those
predicted from Chg cDNA translations in Table 5.1.
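
As a rough illustration of the comparison summarized in Table 5.1, the following sketch totals the proline and Glx (Glu + Gln) percentages for each predicted Chg and each isolated VEP. The values are copied from Table 5.1; the dictionary layout and the function name `pro_glx_total` are assumptions made only for this example, and a combined Pro + Glx figure is just one simple way to compare the profiles.

```python
# Percent-of-total values for Glx (Glu + Gln) and Pro, copied from Table 5.1.
# The data structure and function name are illustrative assumptions.
composition = {
    "Chg 427": {"GLX": 10.9, "PRO": 9.4},
    "Chg 500": {"GLX": 16.3, "PRO": 13.4},
    "Chg 553": {"GLX": 13.3, "PRO": 15.0},
    "VEP 46": {"GLX": 11.2, "PRO": 8.5},
    "VEP 60": {"GLX": 14.4, "PRO": 13.3},
    "VEP 69": {"GLX": 18.3, "PRO": 15.9},
}


def pro_glx_total(name):
    """Return the combined Pro + Glx percentage for one protein (one decimal)."""
    values = composition[name]
    return round(values["GLX"] + values["PRO"], 1)


for name in composition:
    print(f"{name}: Pro + Glx = {pro_glx_total(name):.1f}% of residues")
```

By this measure VEP 60 and VEP 69 resemble Chg 500 and Chg 553, while VEP 46 resembles Chg 427; the comparison is suggestive only and does not establish a precursor-product relationship.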

Discussion 

We present the predicted primary structure of three liver-derived proteins, Chg
500, Chg 427, and Chg 553 (Fig. 5.2). We have shown that mRNAs hybridizing to
cDNA probes from each Chg occur in the liver RNA of estrogen-treated males and
spawning females, but are not detectable in ovarian RNA. Furthermore, the predicted
amino acid compositions of the Chgs are similar to the profiles of three VEPs isolated
from ovarian follicles (Table 5.1). We submit that although the Chgs differ from
mammalian ZP proteins by being estrogen-induced and synthesized in the liver,
they are, in fact, related groups of proteins, as evidenced by the shared identity of a ZP
domain. Chg 500 and 553 can be more specifically grouped as homologs of the
mammalian ZP1 subfamily of molecules, while Chg 427 can be grouped with the
mammalian ZP3 subfamily (Fig. 5.8).

Northern  Analyses 


By showing no indication of Chg mRNA from 20 µg of ovarian RNA compared
with the ample Chg signals from only 2 µg of liver RNA (Fig. 5.6), we provided strong


Figure 5.6 Northern blot analysis testing ovary vs. liver expression of Chgs.

A) Methylene blue staining of a nylon membrane blot of lanes
containing 2.0 µg of total liver RNA next to loads of 20 µg of
total ovarian RNA, from two identically treated female fish.
Lanes a, e, and i contain liver RNA from fish 1, while lanes b, f,
and j contain ten times more RNA isolated from the ovary of fish
1. Likewise, lanes c, g, and k contain liver RNA from fish 2,
while lanes d, h, and l contain ten times more total RNA, isolated
from the ovary of fish 2. RNA kb markers are indicated on the
left, with 28S and 18S rRNA bands indicated on the right.

B) Autoradiograph showing the same nylon membrane after being
cut into three pieces and hybridized (65°C) with the random-primed
[32P] Chg probe indicated above the blot. Although ten
times more ovarian RNA than liver RNA was loaded onto the gel,
only the lanes containing liver RNA showed bands hybridizing to the
Chg probes. No hybridization was seen in the lanes
containing ovary RNA.
evidence that the Chgs are indeed expressed in the liver, but not the ovary, of spawning
females. We additionally showed that the Chgs can be induced in males by injection
with estradiol (Fig. 5.5). From the cDNA sequences, we expected the sizes of the
mRNAs encoding Chgs 500, 427, and 553 to be 1.64, 1.67, and 1.82 kb, respectively.
In northern blots, however, the Chg 500 probe hybridized to a band estimated to be 1.9
kb, while the Chg 427 probe hybridized to two bands, at 1.7 and 1.4 kb, and the Chg 553
probe hybridized to two bands, at 2.3 and 1.8 kb. The doublet mRNAs detected by the
Chg 427 and 553 probes were observed in repeated stringent
hybridizations using different individual samples as
well as with probes representing different sections of the cDNA (not shown). We
interpret these data to suggest that two isoforms, possibly splicing variants of Chgs 427
and 553, are present in the liver total RNA. We also suggest that our cDNA clones
probably did not contain the full 5' untranslated sequence, so that the sizes predicted
from the cDNAs are conservative estimates compared with the actual mRNAs indicated by the
gels.
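
The size comparison above can be restated numerically. The sketch below subtracts the size predicted from each cDNA from each band size estimated on the northern blots; all numbers are taken from the text, while the variable and function names are assumptions chosen for illustration. Band sizes read from a gel are approximate, so small differences should not be over-interpreted.

```python
# Sizes in kb, as given in the text. Names are illustrative assumptions.
predicted_kb = {"Chg 500": 1.64, "Chg 427": 1.67, "Chg 553": 1.82}
observed_kb = {
    "Chg 500": [1.9],
    "Chg 427": [1.7, 1.4],
    "Chg 553": [2.3, 1.8],
}


def size_differences(name):
    """Observed band size minus cDNA-predicted size, in kb, for each band."""
    return [round(band - predicted_kb[name], 2) for band in observed_kb[name]]


for name in predicted_kb:
    for band, diff in zip(observed_kb[name], size_differences(name)):
        print(f"{name}: {band:.2f} kb band differs from cDNA by {diff:+.2f} kb")
```

For each doublet, one band lies at or above the cDNA-predicted size and the other lies close to or below it, which fits the suggestion of two transcript isoforms.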

The  Predicted  Structure  of  Chgs  500  and  553 

The  proline-glutamine-rich  domains  found  in  Chg  500  and  Chg  553,  along  with 
that  of  the  flounder  ZP  (Lyons  et  al.,  1993)  represent  a  novel  protein  domain  for 
vertebrates.  Although  high  proline  and  glutamic  acid/glutamine  compositions  had  long 
been  predicted  through  amino  acid  composition  analyses  of  VEPs  from  F.  heteroclitus 
(Kaighn,  1964)  and  other  fish  (Young  and  Smith,  1956;  Iuchi  and  Yamagami,  1976, 


Table 5.1 Amino Acid Composition, Percent of Total

       Chg 427  Chg 500  Chg 553   VEP 46   VEP 60   VEP 69
ASN        3.7      2.1      3.0
ASP        5.2      3.8      4.6
ASX        8.9      5.9      7.6     11.4      8.9      7.6
GLN        5.7     11.3      8.7
GLU        5.2      5.0      4.6
GLX       10.9     16.3     13.3     11.2     14.4     18.3
SER        6.7      7.1      7.4      7.6      7.8      7.0
GLY        5.7      6.3      5.1      7.6      7.5      7.0
HIS        1.7      1.0      1.1      1.0      2.0      0.5
ARG        3.7      4.0      3.3      4.1      3.5      3.0
THR        6.9      6.7      8.0      9.9      8.9      8.1
ALA        7.2      5.0      4.9      8.5      5.1      4.9
PRO        9.4     13.4     15.0      8.5     13.3     15.9
TYR        4.7      5.4      5.7      3.5      5.3      5.6
VAL        9.4      5.9      3.0      9.0      7.3      5.7
MET        2.2      1.3      1.1      0.0      1.1      0.0
CYS        2.5      4.0      3.4      0.0      0.2      0.0
ILE        3.7      5.0      2.7      2.8      2.6      4.6
LEU        6.9      4.8      5.3      6.5      5.5      5.2
PHE        4.0      2.9      2.3      4.1      2.6      2.7
LYS        4.4      4.4      4.6      4.7      4.2      4.0
TRP        1.2      0.6      0.8

Figure 5.7 Isolation of three vitelline envelope proteins (VEPs) by SDS-PAGE. VEP
69, VEP 60, and VEP 46 were visualized by Coomassie blue staining,
indicating migration patterns according to estimated molecular weights of
69 kDa, 60 kDa, and 46 kDa. After resolution by SDS-PAGE in Tris-tricine
buffers, the three VEPs were electroblotted to PVDF and
submitted for protein analyses. An additional VEP, indicated by a weaker
staining band near 65 kDa, could be visualized, but was not isolated for
characterization.
Ohzu and Kusa, 1981; Kobayashi, 1982; Begovac and Wallace, 1989; Hyllner et al.,
1991, 1995), the extensive PQX repeat is nonetheless extraordinary. The finding that
(Pro-Glx-X) peptides were specifically released from the lysed chorions of medaka (Lee
et al., 1994) offers evidence consistent with the notion that components of Chg 500 and
553 contribute to the structure of the hardened chorion. Insights into the mechanism of
chorion hardening have been provided by the studies of Hagenmaier et al. (1985) and
Oppen-Berntsen et al. (1990), in which Glx-Lys crosslinks were discovered in the
chorions of fertilized but not unfertilized eggs. A somewhat similar crosslinking
phenomenon has been suggested for the proline-rich repeats of mussel adhesive
proteins, where highly repetitive motifs containing hydroxyproline, lysine, and tyrosine
(also modified to 3,4-dihydroxyphenylalanine) are involved in the formation of
underwater adhesives (Rzepecki et al., 1991). The exact mechanism whereby the
vitelline envelope of F. heteroclitus is hardened into a rigid chorion remains a mystery;
however, we suggest that the abundant Pro, Gln, Lys, and Tyr residues found within the
PQX repeating region of Chg 500 and 553 are likely to play significant roles in this
process.

The Predicted Structure of Chg 427

The shortest of the three sequences reported here is that of Chg 427. It does not
contain an extensive repeating region as do the other Chgs; however, the short sequence
(PGK PSK PQS PPT QNQ QQL Q) contains the high proline and glutamine content
characteristic of the repeats of Chgs 500 and 553. Chg 427 is the only sequence of the
three Chgs that contains a predicted N-glycosylation site. Rather than sharing identity
with Chgs 500 and 553 of F. heteroclitus, Chg 427 is most similar to the medaka
L-SF protein, followed closely by three carp "ZP3" sequences (not shown). These five
fish sequences contain ZP domains that share highest identity with the mouse ZP3
subfamily of mammalian ZPs (Fig. 5.8). Because the mouse ZP3 subfamily of molecules
is implicated as the primary sperm receptor in mammals, it is tempting to postulate a
similar role for Chg 427 as a likely candidate for sperm interaction in F. heteroclitus.
However, this postulate is significantly weakened by the well-established existence of
a micropyle on F. heteroclitus eggs (Dumont and Brummett, 1980; Selman and Wallace,
1986). The micropyle of teleost eggs is essentially a narrow channel through the chorion
that provides homologous spermatozoa with direct access to the oocyte membrane
(Dumont and Brummett, 1980; reviews by Guraya, 1986; Hart, 1990). Its existence
removes the need for most of the fertilization-associated interactions that have been
documented in urchins and mammals, including sperm binding, induction of the
acrosome reaction, and the burrowing of sperm through the ZP en route to the oocyte
surface. Therefore, although Chg 427 shares identity with ZP3 molecules, it remains
unclear what function it fulfills.
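
The proline and glutamine content of the short Chg 427 segment quoted above (PGK PSK PQS PPT QNQ QQL Q) can be checked by a simple residue count. This is a minimal sketch; the function name `residue_fraction` is an assumption made for illustration, and only the 19-residue segment given in the text is analyzed.

```python
from collections import Counter

# The short proline/glutamine-rich segment of Chg 427 quoted in the text.
segment = "PGKPSKPQSPPTQNQQQLQ"


def residue_fraction(sequence, residues):
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    counts = Counter(sequence)
    return sum(counts[r] for r in residues) / len(sequence)


counts = Counter(segment)
pq_fraction = residue_fraction(segment, "PQ")
print(f"length = {len(segment)}, Pro = {counts['P']}, Gln = {counts['Q']}")
print(f"Pro + Gln = {100 * pq_fraction:.1f}% of the segment")
```

Proline and glutamine together account for 11 of the 19 residues in this segment, which is far above their combined share of the whole predicted Chg 427 protein (Table 5.1).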

The  ZP  Family  of  Proteins 

A recent review by Harris et al. (1994) has attempted to lend order to the
currently confusing ZP nomenclature by separating all ZP proteins into three groups,
ZPA, ZPB, and ZPC, according to comparisons by protein alignments. Their ZPA



Figure 5.8 Parsimonious tree analysis of ZP domains from seventeen vertebrate ZP
homologs. The unrooted tree was obtained by running 100 bootstrap
replicates of a heuristic search (PAUP 3.1; Swofford, 1993) through a
ClustalV alignment containing only the ZP domains of selected proteins,
and drawn according to the format of the Fitch parsimony program. Three
divisions of ZP homologs resulted, each designated by a separate mouse
ZP protein: ZP1 division, ZP2 division, and ZP3 division. Three piscine
proteins, including Chg 500 and 553, were grouped within the ZP1
division, while five piscine proteins, including Chg 427, were grouped
within the ZP3 division. Bootstrap percentage values are indicated
adjacent to appropriate nodes.
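
To make the bootstrap step concrete, the sketch below shows how one bootstrap replicate of an alignment is built: columns are drawn at random, with replacement, until the replicate has as many columns as the original. It is not the PAUP procedure itself; the toy alignment, the sequence names, and the function name `bootstrap_alignment` are all hypothetical, and the tree search performed on each replicate is not shown.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    # Draw `length` column indices, allowing repeats.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Hypothetical toy alignment; these are not the real ZP-domain sequences.
toy_alignment = {
    "seqA": "CVATLSPDVT",
    "seqB": "CVATLSPDAN",
    "seqC": "CVATPSPLPD",
}

rng = random.Random(0)  # fixed seed so the example is repeatable
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "replicates of", len(replicates[0]["seqA"]), "columns each")
```

A tree is then inferred from every replicate, and the bootstrap percentage at a node is the share of replicate trees that recover the same grouping.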
group is homologous to mouse ZP2, their ZPB group is homologous to mouse ZP1, and
their ZPC group is homologous to mouse ZP3. Thus, this new organization scheme does
not improve on the original grouping offered by Wassarman et al. (1988a,b) based on
the three mouse ZPs: ZP1, ZP2, and ZP3. Therefore, in our discussions of similarity to
mammalian ZP proteins we refer to three major groups of mammalian ZPs, according
to molecular identities shared with mouse ZP1, ZP2, or ZP3. These three "subfamilies"
are nevertheless contained within a larger family of related ZP proteins that can be
recognized by the possession of a conserved region designated as the ZP domain (Bork
and Sander, 1992).
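
The correspondence between the two naming schemes is a one-to-one lookup, which can be written out as follows. The mapping itself is taken from the text; the dictionary and function names are assumptions introduced for this example.

```python
# Harris et al. (1994) group -> homologous mouse ZP protein, as stated in the text.
HARRIS_TO_MOUSE = {"ZPA": "ZP2", "ZPB": "ZP1", "ZPC": "ZP3"}

# Inverse lookup, built from the same table.
MOUSE_TO_HARRIS = {mouse: harris for harris, mouse in HARRIS_TO_MOUSE.items()}


def to_mouse_subfamily(label):
    """Translate a ZPA/ZPB/ZPC label into the mouse-based subfamily name."""
    return HARRIS_TO_MOUSE[label.upper()]


print(to_mouse_subfamily("ZPC"))   # cat ZPC belongs to the mouse ZP3 subfamily
print(MOUSE_TO_HARRIS["ZP1"])
```

Because the two schemes map onto each other exactly, either can be used without loss of information.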

Combining the present molecular data describing Chgs with the reports by Lyons
et al. (1993), Murata et al. (1995), and the GenBank entries for the carp ZP3 molecules,
it appears that the eight liver-derived VEP precursors described for teleosts can be
separated into two distinct groups: one containing Chg 500, Chg 553, and the flounder
ZP protein; and another containing Chg 427, the medaka L-SF, and the three carp ZP3
sequences. Furthermore, these two fish groups share identity with two distinct
mammalian ZP subfamilies: Chgs 500 and 553 are grouped with the mouse ZP1
subfamily, while Chg 427 is grouped with the mouse ZP3 subfamily. Bootstrap values
indicate high confidence in the Chg 427-mouse ZP3 subfamily
division, whereas Chgs 500 and 553 might be expected to group with either the mouse ZP2
or ZP1 subfamily. Whether Chgs 500 and 553 are more closely related to the mouse ZP1 or
ZP2 subfamily, the trend remains that all reported piscine molecules separate into two
rather than three subdivisions. The lack of a third subtype of teleost homolog may
reflect a difference in the construction and function of the teleostean vitelline envelope
from that of mammals, but is more likely due to a lack of targeted investigation. For
instance, although we report the amino acid compositions of three VEPs (69, 60, and
46), there is at least one other F. heteroclitus VEP, with an estimated molecular weight of
65,000, that could be visualized but stained too faintly to isolate. Thus, other minor
F. heteroclitus VEPs remain that may represent a third subclass of teleost VEPs.
It is also possible that a third subtype is synthesized by the ovary rather than the
liver of teleosts, explaining why the studies investigating liver cDNAs have not yet
discovered it. Results obtained with the pipefish Syngnathus scovelli (Wallace and
Begovac, 1989) as well as a preliminary study in F. heteroclitus (Hamazaki et al., 1989b)
suggest that at least one VEP may indeed be synthesized within the ovarian follicle.

By this preliminary characterization of Chg 500, 427, and 553, estrogen-induced,
liver-derived precursors to the vitelline envelope, we hope to set the stage for further
investigations of the regulation, structure, and function of the vitelline envelope and
chorion. Besides being used to study development of the ovarian follicle, the Chgs
should provide excellent biomarkers to indicate either naturally occurring or
toxicological states of estrogen induction in fish.

CHAPTER 6
GENERAL SUMMARY

In  this  dissertation,  I  have  presented  the  nucleotide  and  predicted  amino  acid 
sequences  of  five  Fundulus  heteroclitus  cDNAs:  two  vitellogenins  (Vtg  I  and  Vtg  II)  and 
three  choriogenins  (Chg  500,  Chg  427,  and  Chg  553).  All  five  of  these  protein  products 
are  synthesized  and  secreted  by  the  liver  under  estrogen  induction,  and  transported  by 
the  blood  to  the  ovary.  Vtgs  I  and  II  are  endocytosed  by  the  oocyte  and  processed  into 
liquid  phase  yolk  proteins.  In  contrast,  the  Chgs  are  probably  not  taken  up  by  the 
oocyte,  but  rather  laid  down  as  components  of  the  vitelline  envelope  and  thus  eventually 
contribute  to  the  structure  of  the  chorion. 

As an introduction, I described in Chapter 1 the historical context and initial
goals of the project. Probably the single most essential task accomplished in this work, the
construction of an estrogen-induced liver library, was completed before I became
affiliated with the study, by Marion Byrne, Jyotshnabala Kanungo, and Laura Nelson.
They constructed the library to obtain the primary structure of F. heteroclitus Vtg,
hoping to answer evolutionary as well as biochemical questions. After the initial
investigators disbanded, I became involved with the project, first as a compiler of
sequence data and eventually as primary caretaker of the library.
Chapter 1 concluded with a description of some of my unexpected adventures with the
library, primarily due to fortuitous annealing events, that led to the discovery of cDNAs
coding for Vtg II and the three choriogenins (Chgs).

In order to complete the cDNA sequence of Vtg I, two small overlapping regions
were isolated from the library using anchored PCR (Fig. 2.1). The resulting cDNA
(5112 bp) and predicted amino acid sequences (1704 residues) of Vtg I (gi:459202) were
described in Chapter 2 (Fig. 2.2). Alignment of the F. heteroclitus Vtg against the other
known vertebrate Vtgs revealed 30%-40% sequence identity shared among the
proteins (Fig. 2.4). The sturgeon Vtg sequence was found to share more identity with
chicken and Xenopus Vtgs than with the F. heteroclitus Vtg, suggesting that the F.
heteroclitus Vtg reflects a more derived rather than ancestral vertebrate protein (Fig. 2.5).
We had hoped that by comparing the sequence of F. heteroclitus Vtg with the Vtgs of
other vertebrates, we might find an explanation for why F. heteroclitus yolk proteins
remain in a non-crystalline liquid phase. Analyses predicting secondary structure showed
no obvious differences to account for structural disparity. Although the polyserine
domain of the F. heteroclitus Vtg was the shortest of the five vertebrates, it possessed
the highest relative composition of serine. We hypothesized, assuming these serines were
phosphorylated, that the polyserine domain of F. heteroclitus Vtg may be more
hydrophilic and polar than that of the other Vtgs, perhaps preventing the recombination
of phosvitin and lipovitellin that occurs in granular or crystalline yolk. A graphical
representation was used to emphasize the codon usage of the polyserine domains from
sequenced vertebrate Vtgs (Fig. 2.6). A specific clustering of codons was observed:
TCX codons generally occurred near the 5' end of the domain, and AGY codons, often
the more frequently used, were grouped at the 3' end of the domain. As the first teleost
Vtg to be completely sequenced, this report represented an important step toward
gaining comparative information on Vtg variability.
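
The codon clustering described above can be quantified by recording where each class of serine codon falls along a domain. The sketch below is only an illustration: the coding sequence `toy_domain` is hypothetical (it is not the F. heteroclitus polyserine domain), and the function names are assumptions. Positions are expressed as a fraction of the domain length, with 0 at the 5' end and 1 at the 3' end.

```python
TCX_CODONS = {"TCT", "TCC", "TCA", "TCG"}  # serine codons of the TCX class
AGY_CODONS = {"AGT", "AGC"}                # serine codons of the AGY class


def serine_codon_positions(cds):
    """Relative positions (0 = 5' end, 1 = 3' end) of TCX and AGY serine codons."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    last = max(len(codons) - 1, 1)
    positions = {"TCX": [], "AGY": []}
    for index, codon in enumerate(codons):
        if codon in TCX_CODONS:
            positions["TCX"].append(index / last)
        elif codon in AGY_CODONS:
            positions["AGY"].append(index / last)
    return positions


def mean(values):
    """Arithmetic mean, or None for an empty list."""
    return sum(values) / len(values) if values else None


# Hypothetical ten-codon serine run, used only to exercise the functions.
toy_domain = "TCTTCCTCAAGTTCGAGCAGTAGCAGCAGT"
positions = serine_codon_positions(toy_domain)
print("mean TCX position:", round(mean(positions["TCX"]), 2))
print("mean AGY position:", round(mean(positions["AGY"]), 2))
```

A lower mean position for TCX than for AGY codons corresponds to the 5'/3' clustering described for Vtg I, whereas similar means would correspond to an interspersed arrangement.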

In Chapter 3, the sequence of the Vtg II cDNA and the predicted protein structure were
presented. Vtg II shares 45% overall identity with Vtg I, and 30-40% identity with the
other vertebrate Vtgs. The polyserine domain of Vtg II is slightly smaller than that of
Vtg I, but more surprising is the polyserine codon organization. The trend that had been
observed previously for Vtg I and other vertebrate Vtgs was not apparent in the
polyserine domain of Vtg II. Rather than clustering, the TCX and AGY codons
were interspersed throughout the length of the domain. In a comparison of mRNA
expression, Vtg I transcripts were 10 times more prevalent than those of Vtg II in total
liver RNA isolated from spawning females and estrogen-induced males. According to
these data, we suggest that Vtg I is the primary and Vtg II a secondary yolk precursor
in F. heteroclitus.

N-terminal  sequences  of  isolated  yolk  proteins  were  provided  in  Chapter  4.  We 
were  able  to  map  out  a  precursor-product  relationship  for  seven  yolk  proteins  by 
comparing  obtained  N-terminal  sequences  with  the  predicted  amino  acid  sequences 
derived  from  Vtg  I  and  Vtg  II.  A  PEST  site  found  in  the  Vtg  region  mapping  to  YP  125 
was  hypothesized  as  a  possible  factor  influencing  the  proteolytic  processing  of  YP  125 
during  oocyte  maturation  as  compared  to  YP  105,  which  does  not  contain  a  PEST  site. 

In Chapter 5, the cDNA and predicted protein sequences of the choriogenins were
presented. The Chg mRNAs were shown to be expressed by the liver of reproductive
females and estrogen-induced males. It was further shown that Chg mRNA was not
detected in RNA isolated from the ovaries of reproductive females. The F. heteroclitus
Chgs were recognized as homologs of the ZP proteins of mammals by their possession
of a ZP domain. A parsimonious tree analysis of the ZP domains from the three Chgs,
five other fish homologs, and nine mammalian ZP proteins separated the molecules
into three major subdivisions. Chg 427 was grouped with mouse ZP3 and its homologs.
Chg 500 and Chg 553 were grouped with the mouse ZP1 subdivision, but not
significantly separated from the ZP2 subdivision. We isolated vitelline envelope proteins
(VEPs) from ovarian follicles and obtained their amino acid compositions for comparison
with the predicted compositions of the three Chgs. Although similarities existed between
the Chgs and the VEPs, we are currently awaiting N-terminal sequence analysis data to
provide unambiguous matches, verifying that Chgs are processed into bona fide VEPs.

The  sequences  of  these  five  liver-derived  molecules  provide  us  with  a  substantial 
amount  of  new  information  regarding  the  hepatic  contribution  to  oocyte  development. 
Not  only  do  we  have  data  describing  the  primary  structure  of  five  important  components 
of  the  oocyte,  but  we  have  convincing  evidence  that  they  originate  heterosynthetically  in 
the  liver  and  are  produced  under  estrogen  induction.  Although  these  results  provide 
exciting  opportunities  for  further  study,  the  flames  of  our  ambition  are  truly  fed  by  the 
estrogen-induced  liver  library  as  an  excellent  tool  to  aid  in  elucidation  of  molecules  that 
contribute  to  reproductive  processes. 